1. Introduction

This paper describes the programmer's view of SPARCstation-1: address spaces, caching and
memory management, and interrupt levels. It is a synthesis of information contained in the hardware
specifications, but organized to be useful to a programmer.

Where appropriate, comparisons with the "standard" Sun4 architecture are made.

WARNING: This document is a DRAFT and may contain errors. Please report all mistakes to
the author for correction.

Major Changes Since Draft 1 (Version 1.7)

(1) The page size has changed from 8K to 4K.

(2) The size of a physical address has changed from 29 bits to 28 bits.

(3) The Sbus has moved from Type 0 space to Type 1 space, and there has been a major reorganization
of the Type 1 addresses to accommodate this.

Changes Since Draft 2 (Version 2.4)

(4) Minor typographical and editorial changes.

(5) Better explanations.

Changes Since Draft 3 (Version 3.7)

(6) The Interrupt Register is used to clear level 15 interrupts.

(7) All Sbus devices are now described using relative offsets.

(8) More bits are used in the Auxiliary Input/Output Register. (Which used to be the Auxiliary Output
Register.)

Changes Since Draft 4 (Version 4.7)

(9) The interrupt levels have been changed slightly. All Sbus devices, including the builtin ones,
interrupt on Sbus IRQ levels only.

(10) The Auxiliary Input/Output Register has changed slightly.

(11) The definition of the DMA Write bit was backwards.

(12) The video subsystem is off the board, again.

Changes Since Draft 5 (Version 5.6)

Better explanations and addition of more examples.

Changes Since Draft 6 (Version 6.1)

(1) Added warning that this is still a DRAFT document and may not be completely accurate.

(2) Described the bugs in various levels of hardware:

    Synchronous parity errors cause asynchronous traps (fixed in P1.7)
    SER records asynchronous errors (won't be fixed)
    ASER and ASEVAR latch on synchronous memory errors (won't be fixed)
    On cache fill errors, SEVAR may not have exact address of problem (won't be fixed)
    ASER sometimes isn't set on asynchronous errors (won't be fixed)
    ASEVAR isn't properly sign-extended on DVMA errors (won't be fixed)

(3) Audio/ISDN replaces Audio DAC.

(4) Level 8 interrupts can be masked.

(5) Video goes into slot 3.

(6) Sbus IRQ6 and IRQ7 now map to SPARC levels 8 and 9, instead of 9 and 13, respectively.

(7) Miscellaneous corrections.

2. Address Spaces 

The SPARC Architecture defines the existence of at least 4 address spaces. A given implementation
may define more than 4 address spaces. Selection of a particular address space is done via the Address
Space Indicator (ASI) field of the load and store alternate address space instructions. Ordinary load and
store instructions automatically go to User or Supervisor Data space, depending upon the mode of the CPU.
Instruction fetches by the CPU automatically go to User or Supervisor Instruction space, again depending
upon the mode of the CPU.

The following table describes the address spaces defined by the Sun4 Architecture and the
SPARCstation-1 implementation.



    ASI    Sun4 Use                   SPARCstation-1 Use   Comments

    0x0    Reserved                   Reserved
    0x1    Reserved                   Reserved
    0x2    System Space               Same                 Note 1
    0x3    Segment Map                Same
    0x4    Page Map                   Same
    0x5    Block Copy                 Reserved             Note 2
    0x6    Region Map                 Reserved             Note 2
    0x7    Flush Cache (Region)       Reserved             Note 2
    0x8    User Instruction           Same
    0x9    Supervisor Instruction     Same
    0xA    User Data                  Same
    0xB    Supervisor Data            Same
    0xC    Flush Cache (Segment)      Same
    0xD    Flush Cache (Page)         Same
    0xE    Flush Cache (Context)      Same
    0xF    Flush Cache (User)         Reserved             Note 3
    0x10   Flush I-Cache (Segment)    Reserved             Note 2
    0x11   Flush I-Cache (Page)       Reserved             Note 2
    0x12   Flush I-Cache (Context)    Reserved             Note 2
    0x13   Flush I-Cache (User)       Reserved             Note 2
    0x14   Flush D-Cache (Segment)    Reserved             Note 2
    0x15   Flush D-Cache (Page)       Reserved             Note 2
    0x16   Flush D-Cache (Context)    Reserved             Note 2
    0x17   Flush D-Cache (User)       Reserved             Note 2
    0x1B   Flush I-Cache (Region)     Reserved             Note 2
    0x1F   Flush D-Cache (Region)     Reserved             Note 2


Note 1. See System Space table (next section).

Note 2. SPARCstation-1 has no corresponding function.

Note 3. This is a change in the specification between Sunrise and Sunray.

User and Supervisor Instruction and Data spaces are collectively known as "Device Space". All
accesses to Device Space go through the Memory Management Unit (MMU). All the other address spaces
are collectively known as "Control Space". The non-System Space portions of Control Space all deal
with Cache and MMU management, and are discussed in the section on "Contexts, Caching, and the
MMU". System Space is discussed in the next section.

3. System Space (ASI = 2)

System Space is a portion of control space that is used to access various devices, as the following
table indicates:



    A31:28   Sun4 Use                  SPARCstation-1 Use    Comments

    0x0      ID Prom                   Reserved              Note 1
    0x1      Reserved                  Reserved
    0x2      Reserved                  Reserved
    0x3      Context Register          Same
    0x4      System Enable Register    Same
    0x5      Reserved                  Reserved
    0x6      Bus Error Register        Bus Error Registers   Note 5
    0x7      Diagnostic Register       Unused                Note 2
    0x8      (D-)Cache Tags            Cache Tags
    0x9      (D-)Cache Data            Same                  Note 3
    0xA      I-Cache Tags              Reserved              Note 4
    0xB      I-Cache Data              Reserved              Note 4
    0xC      Reserved                  Reserved
    0xD      Reserved                  Reserved
    0xE      VME Interrupt Vector      Reserved              Note 4
    0xF      Serial Port               Same                  MMU bypass



Note 1. SPARCstation-1 does not have an ID Prom and a timeout will occur.

Note 2. SPARCstation-1 has no diagnostic register but a write to this address will just be ignored and not
cause a timeout.

Note 3. This is a change in the specification between Sunrise and Sunray.

Note 4. SPARCstation-1 has no corresponding function.

Note 5. SPARCstation-1 has four Bus Error Registers, compared to Sun4's one.

The Context Register, Cache Tags, and Cache Data are described in the section on "Contexts, Caching,
and the MMU". The rest of the registers in System Space are described below.

3.1. System Enable Register 

The System Enable Register is referenced via byte loads and stores at location (ASI=0x2,
A31:28=0x4). It has the following format:

      7   6   5   4   3   2   1   0
    +---+---+---+---+---+---+---+---+
    | N | 0 | S | C | 0 | R | 0 | D |
    +---+---+---+---+---+---+---+---+

    Bit  Field            Meaning
    7    N  ENA_NOTBOOT   0 = all supervisor references go to EPROM
                          1 = normal MMU operation
    6                     Reserved (Enables I/O Cache in Sun4)
    5    S  ENA_SDVMA     1 = all DVMA is enabled
    4    C  ENA_CACHE     1 = Cache enabled
    3                     Reserved (Enables video display in Sun4)
    2    R  ENA_RESET     1 = Reset the System (asserts SBRESET)
    1                     Reserved (Resets VMEbus in Sun4)
    0    D  ENA_DIAG      Always 0 (Diagnostic/Monitor in Sun4)

All bits are initialized to zero by a reset. Setting ENA_RESET to one will cause a reset, and control
will not be returned to the program that does so; rather, a reboot will occur. Software (or the boot PROM)
should set ENA_NOTBOOT to one after initializing the MMU.
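
The bit assignments of the System Enable Register can be written down as C masks. The following is
only a sketch under stated assumptions: the mask values are read off the register diagram, and the
helper name sys_enable_set and the ENA_DEFINED mask are hypothetical. It computes the byte to be
stored; the store itself must still be a byte store to (ASI=0x2, A31:28=0x4).

```c
#include <stdint.h>

/* Bit masks read off the System Enable Register diagram. */
#define ENA_NOTBOOT 0x80u   /* 1 = normal MMU operation */
#define ENA_SDVMA   0x20u   /* 1 = all DVMA is enabled */
#define ENA_CACHE   0x10u   /* 1 = cache enabled */
#define ENA_RESET   0x04u   /* 1 = reset the system */

/* Hypothetical: the bits that have a defined use on SPARCstation-1. */
#define ENA_DEFINED (ENA_NOTBOOT | ENA_SDVMA | ENA_CACHE | ENA_RESET)

/* Return the new register value with 'bits' turned on, keeping the
 * reserved bits (6, 3, 1) and the always-zero bit 0 at zero. */
static inline uint8_t sys_enable_set(uint8_t current, uint8_t bits)
{
    return (uint8_t)((current | bits) & ENA_DEFINED);
}
```

For instance, sys_enable_set(0, ENA_NOTBOOT) gives the value a boot PROM would store once the MMU
has been initialized.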

3.2. Bus Error Registers

There are four registers, divided into two sets of two, used to indicate the type and location of bus
errors. One set is for synchronous errors, and the other for asynchronous errors. Synchronous errors are
those that occur due to the execution of the current instruction and are reported to the CPU by a trap at the
end of that instruction's execution. All errors that cannot be associated with the execution of the current
instruction, but are related to such things as DVMA activity, buffered writes, or cache write-backs [1], are
considered asynchronous and are reported via an interrupt on level 15. After servicing the level 15 interrupt,
it is cleared by toggling bit 0 of the Interrupt Register.

There is an exception to the above rule. On machines prior to the P1.7 level, parity errors that occur
(or any condition that causes SE_MEMERR, described below, to be set) during CPU memory accesses
cause the reporting of both a synchronous and asynchronous error. For parity errors that occur during data
fetches, the data-access trap occurs first and the level 15 interrupt remains pending. Software may clear the
level 15 interrupt while processing the data-access trap. For parity errors that occur during instruction
fetches, the level 15 interrupt occurs first and the text-access trap never occurs. Software can distinguish
true asynchronous errors from instruction fetch errors by maintaining an invalid value in the SEVAR and
comparing the SEVAR to the ASEVAR on asynchronous errors. If they compare equal, then this is an
instruction-fetch error; otherwise it is a true asynchronous error. Software must remember to reload the
SEVAR with the invalid value after processing all synchronous (including instruction-fetch) errors.
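
The comparison just described can be sketched in C as follows. This is an illustration only: the
function and type names are hypothetical, and the sentinel value is an assumption (an address whose
bits 31:29 are not all the same, which the hardware described in this paper cannot present as a
valid virtual address). Reading the two registers is left to the caller.

```c
#include <stdint.h>

/* Assumed sentinel kept in the SEVAR between errors: bits 31:29 are
 * not all equal, so it cannot match a real pseudo-virtual address. */
#define SEVAR_SENTINEL 0x40000000u

typedef enum {
    L15_INSTRUCTION_FETCH,  /* parity error on an instruction fetch */
    L15_TRUE_ASYNC          /* genuine asynchronous error */
} l15_cause_t;

/* Classify a level 15 interrupt on pre-P1.7 hardware from the values
 * read out of the SEVAR and the ASEVAR. */
static inline l15_cause_t classify_level15(uint32_t sevar, uint32_t asevar)
{
    return (sevar == asevar) ? L15_INSTRUCTION_FETCH : L15_TRUE_ASYNC;
}
```

After either outcome the handler would write SEVAR_SENTINEL back into the SEVAR so that the next
comparison starts from a known state.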

On P1.7 and later boards, memory errors during CPU memory accesses only cause the reporting of a
synchronous error; a level 15 interrupt does not occur. (The asynchronous registers still latch on
synchronous memory errors, however, and must be cleared; see the descriptions of the ASER and ASEVAR,
below.)

The Bus Error Registers are all fullword in size, although they can be accessed via byte, halfword, or
fullword loads and stores, just as memory is. They reside at the following addresses in ASI=2 space:

    Address      Description

    0x60000000   Synchronous Error Register
    0x60000004   Synchronous Error Virtual Address Register
    0x60000008   Asynchronous Error Register
    0x6000000C   Asynchronous Error Virtual Address Register

Although in normal use the registers can be treated as read-only, they can be written for diagnostic
purposes.

3.2.1. Synchronous Error Register

The Synchronous Error Register (SER) occupies four bytes at locations (ASI=0x2, A31:28=0x6,
A3:0=0x0 to 0x3). Reading any portion of the register also clears that portion. It has the following format:

     31              16 15 14          8  7  6  5  4  3  2  1  0
    +------------------+--+-------------+--+--+--+--+--+--+--+--+
    | 0 ...          0 | R| 0 ...     0 | I| P| T| B| M| 0| S| W|
    +------------------+--+-------------+--+--+--+--+--+--+--+--+

    R   SE_WRITE      1 = Error during write cycle, 0 = read cycle
    I   SE_INVALID    1 = Valid bit was zero in a page map entry
    P   SE_PROTERR    1 = Protection error (see below)
    T   SE_TIMEOUT    1 = Non-existent device was addressed
    B   SE_SBERR      1 = Bus error during Sbus master access
    M   SE_MEMERR     1 = Memory (parity or ECC) error
    S   SE_SIZERR     1 = Incorrect size transfer attempted
    W   SE_WATCHDOG   1 = Restart due to IU error

The SER records all errors since it was last cleared. This includes asynchronous errors as well;
the SER must be read to clear it as part of asynchronous error processing. The SE_WRITE bit
records the type of access (read or write) of the last error.

[1] SPARCstation-1 does not have a write-back cache, but if it did it could cause asynchronous errors.

A protection error can be caused by an attempted write to a read-only page, or by a user-mode access
to a supervisor-only page.

A timeout is reported on access to a non-existent device, except for accesses to non-existent physical
memory. See the section "Type 0 Space," below.

The Memory Error Register must be inspected when a memory error occurs, to further isolate the
cause of the error. Note that synchronous memory errors also cause the Asynchronous Error Register and
Asynchronous Error Virtual Address Register to be latched; see the description of these registers below for
more information.
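
As an aid to reading the SER layout, the bit assignments can be expressed as C masks. This is a
sketch, not an authoritative header: the numeric values are taken from the bit positions in the
register diagram, and the helper names (ser_was_write, ser_cause_count) and the SE_CAUSES mask are
hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Masks read off the SER diagram (R at bit 15, I..W at bits 7..0). */
#define SE_WRITE    0x8000u
#define SE_INVALID  0x0080u
#define SE_PROTERR  0x0040u
#define SE_TIMEOUT  0x0020u
#define SE_SBERR    0x0010u
#define SE_MEMERR   0x0008u
#define SE_SIZERR   0x0002u
#define SE_WATCHDOG 0x0001u

/* Hypothetical: the bits that name an error cause, as opposed to the
 * access type (SE_WRITE) or the restart indication (SE_WATCHDOG). */
#define SE_CAUSES (SE_INVALID | SE_PROTERR | SE_TIMEOUT | \
                   SE_SBERR | SE_MEMERR | SE_SIZERR)

/* True if the last recorded error happened on a write cycle. */
static inline bool ser_was_write(uint32_t ser)
{
    return (ser & SE_WRITE) != 0;
}

/* Number of distinct error causes recorded since the last clear. */
static inline int ser_cause_count(uint32_t ser)
{
    uint32_t causes = ser & SE_CAUSES;
    int n = 0;

    while (causes != 0) {
        causes &= causes - 1;   /* drop the lowest set bit */
        n++;
    }
    return n;
}
```

Because reading the SER clears it, a handler would read the register once into a variable and pass
that saved copy to helpers like these.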

Not all bus errors cause immediate traps. Due to pipelining, the CPU fetches instructions four cycles
before they will be executed, so it is possible that the CPU will attempt to fetch an instruction that will not,
in fact, be executed. To prevent spurious traps, the CPU does not trap on memory exceptions until it
actually needs to execute the instruction that it was unable to fetch.

For example, suppose we have the following instruction sequence in virtual memory, where a, b, c,
etc. represent miscellaneous instructions:

            a
            b
            bz label
            d
        --- page boundary ---
            e           <- this page is marked invalid
            f
            g
        --- page boundary ---
    label:              <- this page is valid
            x
            y
            z

These instructions will advance through the pipeline as follows:

    Time       1    2    3    4    5

    Fetch      d    -    x    y    z
    Decode     bz   d    -    x    y
    Execute    b    bz   d    -    x
    Write      a    b    bz   d    -

At time (2), the CPU wants to fetch e but the page is marked invalid, so the invalid bit is set in the SER and
the instruction address is set in the SEVAR. However, the branch (if taken) means that e is never needed,
so it would be incorrect for the CPU to trap on a page fault due to the attempt to fetch e.

Now let's examine the following sequence:

            a
            b
            st something to a read-only page
            d
        --- page boundary ---
            e           <- this page is marked invalid
            f
            g

The pipeline now looks as follows:

    Time       1    2    3    4    5

    Fetch      d    -    -    x    y
    Decode     st   d    -    -    x
    Execute    b    st   d    -    -
    Write      a    b    st   -    -



DRAFT Version 83, 89/06/10 Page 5 



Sun Confidential SPARCstation-1 Programmer's Model DRAFT



The attempt to fetch e from an invalid page at time (2) will turn on the SE_INVALID bit in the SER, but
the CPU will not take an instruction access exception until it actually needs to execute e, at time (5). The
store to a read-only page at time (3), however, does result in an immediate data access exception, and the
CPU will find both the SE_INVALID bit and the SE_PROTERR bit on in the SER. (The exception results
in a flush of the pipe, and instruction d never does get to the Write stage in step (4).)

A similar scenario, where the store is replaced by a branch (in user mode) to a supervisor-only page,
can result in multiple bits being on for instruction access exceptions.

It is up to the software to determine the true cause of the exception when multiple bits are on in the
SER. Here is one algorithm:

    SEVAR = getsevar();
    SER = SERsave = getser();
    SER &= ~(SE_WRITE | SE_WATCHDOG);

    if (data access exception)
        error_addr = SEVAR;
    else if (instruction access exception)
        error_addr = old PC;
    else
        /* CAN'T HAPPEN */;

    if (SER & (SER - 1)) {
        /* multiple bits on; must manually probe the PME */
        pme = getpme(error_addr);
        if (pme valid) {
            if ((SER & SE_PROTERR) && (pme denies access)) {
                SER = SE_PROTERR;
            } else
                SER &= ~(SE_PROTERR | SE_INVALID);
        } else
            SER = SE_INVALID;
    }

    /*
     * Note: we could still have other multiple bits on (TIMEOUT,
     * MEMERR, SIZERR, SBERR), but we probably won't recover from
     * this condition anyway, so it really doesn't matter.
     *
     * But if you really wanted to know, you'd do something like
     * this:
     */

    /* more than one of TIMEOUT, SBERR, MEMERR, or SIZERR */
    (void) getser();        /* make sure it's clear */
    if (on_fault())
        newSER = getser();
    else {
        register int x;

        newSER = 0;
        x = *error_addr;    /* probe the address to see what happens */
    }
    no_fault();
    /* use newSER to figure out what the problem was, if any */
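
The first half of the algorithm above can be isolated into a function that is testable away from the
hardware. The sketch below is an illustration under assumptions: the mask values are read off the
SER diagram, the function name resolve_ser is hypothetical, and the page map entry probe is replaced
by two boolean parameters that the caller would derive from the PME.

```c
#include <stdbool.h>
#include <stdint.h>

#define SE_WRITE    0x8000u
#define SE_INVALID  0x0080u
#define SE_PROTERR  0x0040u
#define SE_WATCHDOG 0x0001u

/* Reduce a saved SER value to the true cause when both SE_INVALID and
 * SE_PROTERR may be on.  pme_valid and pme_denies stand in for the
 * result of probing the page map entry at the faulting address. */
static inline uint32_t resolve_ser(uint32_t ser, bool pme_valid, bool pme_denies)
{
    ser &= ~(SE_WRITE | SE_WATCHDOG);

    if (ser & (ser - 1)) {              /* more than one bit on */
        if (pme_valid) {
            if ((ser & SE_PROTERR) && pme_denies)
                ser = SE_PROTERR;       /* the protection fault is real */
            else
                ser &= ~(SE_PROTERR | SE_INVALID);
        } else {
            ser = SE_INVALID;           /* the page really is invalid */
        }
    }
    return ser;
}
```

The expression ser & (ser - 1) is nonzero exactly when more than one bit is set, which is why it
serves as the "multiple bits on" test.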
3.2.2. Synchronous Error Virtual Address Register

The Synchronous Error Virtual Address Register (SEVAR) occupies four bytes at locations
(ASI=0x2, A31:28=0x6, A3:0=0x4 to 0x7). It contains the virtual address associated with the last
synchronous bus error. It is not latched.

Note that on errors resulting from cache-fill operations, the SEVAR will contain the address
that the CPU presented to the cache chip that triggered the cache-fill operation. This may or may
not be the address of the word that actually caused the error.

The SEVAR has the following format: 

     31                                          0
    +---------------------------------------------+
    |           Virtual Address (A31:0)           |
    +---------------------------------------------+



3.2.3. Asynchronous Error Register

The Asynchronous Error Register (ASER) occupies four bytes at locations (ASI=0x2, A31:28=0x6,
A3:0=0x8 to 0xB). Reading any portion of the register also clears that portion. It has the following format:

     31                      8  7  6  5  4  3        0
    +-------------------------+--+--+--+--+-----------+
    |        undefined        | W| 0| T| D| 0 ...   0 |
    +-------------------------+--+--+--+--+-----------+

    W   ASE_WBACKERR   1 = Valid bit was zero in a page map entry
    T   ASE_TIMEOUT    1 = Non-existent device was addressed
    D   ASE_DVMAERR    1 = Bus error during DVMA access

The ASER latches (freezes) with the cause of an asynchronous error, ignoring subsequent
asynchronous errors, until read and cleared. It is also latched when a synchronous memory error
(SE_MEMERR) occurs, and should be read to unlatch it as part of SE_MEMERR processing. Note
that bits in the SER are set when bits in the ASER are set; thus the SER should be read to clear it as
part of asynchronous error processing.

A write-back error can occur on systems with write-back caches, and/or on systems that do buffered
writes, when either the hardware malfunctions or the MMU mapping is changed without properly flushing
the cache. In addition, certain devices (for example, frame buffers) will generate write-back errors under
device-specific conditions when a store is attempted to them.

A timeout is reported on access to a non-existent device, except that accesses to non-existent
physical memory may produce detectable behavior other than timeouts. (See the section "Type 0 Space,"
below.) For SPARCstation-1, this can only happen if the MMU is set up to map a non-existent device or if
the hardware malfunctions.

The specific cause of a DVMA bus error must be determined by polling the possible sources to see
which indicated the error. All possible sources of DVMA errors of this type must be recognizable in some
way. For SPARCstation-1, the only possible source of DVMA bus errors is memory parity errors. These
can be determined by examining the Memory Error Register, described below.

Due to a bug in the cache chip, the ASER is not always set when an asynchronous error occurs.
In this event, the ASER can be reconstructed from the bits in the SER. SE_MEMERR should be on
in the SER. In addition, SE_TIMEOUT indicates that ASE_TIMEOUT should have been reported,
and SE_SBERR indicates that ASE_WBACKERR should have been reported. The address in the
ASEVAR is correct even when the ASER is not set. This bug is in all versions of the hardware,
including P1.7's, and is not expected to be fixed.
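
The reconstruction rule in the preceding paragraph maps directly onto a small C function. This is a
sketch only: the mask values are read off the SER and ASER diagrams, and the function name
reconstruct_aser is hypothetical.

```c
#include <stdint.h>

/* SER bits used by the workaround (values from the SER diagram). */
#define SE_TIMEOUT   0x0020u
#define SE_SBERR     0x0010u

/* ASER bits (values from the ASER diagram). */
#define ASE_WBACKERR 0x0080u
#define ASE_TIMEOUT  0x0020u

/* Rebuild the ASER value that should have been reported, given the
 * saved SER contents, for the case where the ASER itself read as zero. */
static inline uint32_t reconstruct_aser(uint32_t ser)
{
    uint32_t aser = 0;

    if (ser & SE_TIMEOUT)
        aser |= ASE_TIMEOUT;
    if (ser & SE_SBERR)
        aser |= ASE_WBACKERR;
    return aser;
}
```

A level 15 handler would call this only when the ASER it read carried no cause bits, and would still
take the fault address from the ASEVAR.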

3.2.4. Asynchronous Error Virtual Address Register

The Asynchronous Error Virtual Address Register (ASEVAR) occupies four bytes at locations
(ASI=0x2, A31:28=0x6, A3:0=0xC to 0xF). It contains the (pseudo) virtual address associated with the
asynchronous bus error described in the ASER. It is latched under the same conditions that the ASER is
latched. It is unlatched when it is read, not when the ASER is read. Thus, the ASEVAR should be read to
unlatch it as part of SE_MEMERR processing.

The ASEVAR has the following format:

     31 30 29                                    0
    +--+--+---------------------------------------+
    | S| S|    Pseudo Virtual Address (A29:0)     |
    +--+--+---------------------------------------+

    S   Bits 31:30 are copies of bit 29.

The address is called a "pseudo-virtual" address because the hardware only carries the low-order 30 bits
of the virtual address onto the bus, and assumes that bits 31:29 are all the same. The ASEVAR reverses
this process by copying bit 29 into bits 31 and 30 on asynchronous errors reported by the IU. Due to a bug
in the cache chip, bits 31 and 30 are zero on DVMA asynchronous errors (ASE_DVMAERR is on).
Software must do the sign extension itself.
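
The software sign extension amounts to copying bit 29 into bits 31 and 30. A minimal sketch follows;
the function name asevar_sign_extend is hypothetical.

```c
#include <stdint.h>

/* Copy bit 29 of a pseudo-virtual address into bits 31:30, as the
 * hardware fails to do on DVMA asynchronous errors. */
static inline uint32_t asevar_sign_extend(uint32_t asevar)
{
    uint32_t low30 = asevar & 0x3FFFFFFFu;   /* keep A29:0 */

    if (low30 & 0x20000000u)                 /* bit 29 set? */
        return low30 | 0xC0000000u;
    return low30;
}
```

Applying the function to a value that the hardware has already extended correctly leaves it
unchanged, so it can be used unconditionally on every ASEVAR read.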

Determining the context register value associated with an asynchronous error is usually
straightforward; there is only one tricky case.

Since DVMA is always done using context 0, the address associated with a DVMA error will always
be in context 0.

Non-DVMA asynchronous errors are due to buffer chip activity. The buffer chip allows only one
outstanding store; a subsequent store will stall the CPU in the middle of execution of the second store until
the outstanding store completes. If it completes with an asynchronous error, the error will be reported to the
CPU immediately after execution of the second store instruction finishes. (This is not necessarily
completion of the second store itself, as it may itself be buffered. This is just completion of the store
instruction from the CPU's point of view.) Unless the second store is a write to the context register, the
address of the asynchronous error will be associated with the value in the context register (the current
context).

If the second store does modify the context register, then the address of the asynchronous error is
associated with the previous context, which must be determined by software. (If, for example, the first
store was to a supervisor-only page, then the actual context is irrelevant as supervisor-only pages are
mapped into all contexts.)

One can construct pathological cases where it would be impossible to determine that an
asynchronous error is associated with the previous context (for example, a store to a user page, followed by
a branch, with the store to the context register in the delay slot of the branch). It is up to software to avoid
these pathologies.

3.2.5. Simultaneous errors

It is possible for both a synchronous and an asynchronous error to be reported simultaneously.
Consider the following case:

    st %g0, [%l0]       ! this address causes an asynchronous timeout
    st %g0, [%l1]       ! this address causes a page fault

Depending upon the alignment of the instructions in the cache, it is possible for the IU to take the page
fault trap (a synchronous error) first, and while it is disabled for traps but before the SER has been read, the
asynchronous fault can be reported. This will turn on the MEMERR bit in the SER, which can lead
software to believe that this is a synchronous memory error. Since the MEMERR is really asynchronous,
there will be a level 15 interrupt pending. If software treats this error as synchronous, and diligently reads
the SER, SEVAR, ASER, and ASEVAR to clear and/or unlatch them, then when traps are eventually
enabled and the level 15 interrupt occurs software will discover that there is no information in either the
ASER or the SER pertaining to the asynchronous interrupt.

Software can avoid this difficulty by comparing the ASEVAR to the SEVAR whenever MEMERR is
set on a synchronous trap. If they are identical, then this is a true synchronous MEMERR. If they are
different, then the MEMERR is associated with the asynchronous trap. Software should clear the pending
level 15 interrupt and process the asynchronous error, using the ASER and ASEVAR values, and being
cognizant of the bugs in asynchronous error reporting described previously. The synchronous error can be
ignored, for it will recur if and when execution of the program is resumed.
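
The check recommended for synchronous traps can be sketched as a three-way classification. This is
an illustration only: the SE_MEMERR value is read off the SER diagram, and the enumeration and
function names are hypothetical.

```c
#include <stdint.h>

#define SE_MEMERR 0x0008u   /* from the SER diagram */

typedef enum {
    MEMERR_NONE,    /* SE_MEMERR not set; handle the trap normally */
    MEMERR_SYNC,    /* true synchronous memory error */
    MEMERR_ASYNC    /* MEMERR belongs to a pending level 15 interrupt */
} memerr_kind_t;

/* Decide, inside a synchronous trap handler, what a set SE_MEMERR bit
 * refers to, using the saved SER, SEVAR, and ASEVAR values. */
static inline memerr_kind_t classify_memerr(uint32_t ser, uint32_t sevar,
                                            uint32_t asevar)
{
    if ((ser & SE_MEMERR) == 0)
        return MEMERR_NONE;
    return (sevar == asevar) ? MEMERR_SYNC : MEMERR_ASYNC;
}
```

In the MEMERR_ASYNC case the handler would process the asynchronous error and simply return from the
synchronous trap, relying on the fault to recur when the program resumes.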

3.2.6. Serial Port 

The serial port is referenced by byte loads and stores at locations beginning at (ASI=0x2,
A=0xF0000000). This access is provided so that the serial port may be used before the MMU has been
initialized, for example by the PROM monitor. Software normally accesses the serial port via I/O space
through the MMU.

See the section "Serial Ports" under "Type 1 Space", below, for more information on the serial
port registers.

4. Physical Space

The MMU maps virtual addresses in Device Space to physical addresses in Physical Space. Physical
space is further subdivided into four types, as indicated in the following table.



    Type   Sun4 Use               SPARCstation-1 Use   Comments

    0      Main Memory            Same
    1      I/O Space              Same
    2      VMEbus, 16-bit data    Unused               Note 1
    3      VMEbus, 32-bit data    Unused               Note 1

Note 1. In SPARCstation-1, references to type 2 or 3 space cause a timeout.

The size of a physical address is 28 bits.

4.1. Type 0 Space

Type 0 space contains the main memory (RAM) in SPARCstation-1. Since PA27:0 are used for
RAM device decoding, the Sbus can support a theoretical maximum of 256 Mbytes of RAM. However,
the SPARCstation-1 implementation only supports a maximum of 64 Mbytes. In addition, individual
SPARCstation-1 machines can be configured with as little as 4 Mbytes of memory. To explain what
happens when non-existent RAM is addressed, the implementation must be explained and some terms defined.

The SPARCstation-1 memory subsystem contains two RAM controllers. Each RAM controller
controls a "bank" of 32 Mbytes of address space. Each bank is made up of two "sets" spanning 16 Mbytes
each. Each set contains four SIMMs (Single Inline Memory Modules). Each SIMM consists of 9
chips. Each chip is either a 1 Mbit or a 4 Mbit DRAM. All the chips in a SIMM are of the same type, and
all the SIMMs in a set must be of the same type. A set of 1 Mbit DRAMs contains 4 Mbytes of memory,
and a set of 4 Mbit DRAMs contains 16 Mbytes of memory. The SIMMs in one set can be of a different
type than the SIMMs in another set, even in the same bank.

The RAM controllers require PA27 to be zero. If PA27=1, then no controller responds and a bus
timeout occurs.

PA26:25 selects the appropriate RAM controller. One controller responds to 0x0, the other responds
to 0x1. If PA26:25=2 or 3, then no controller responds and a bus timeout occurs.

PA24 selects one of the two sets of SIMMs controlled by a controller. If the selected set is not
installed (a hole), then on writes the data is thrown away and on reads the bus lines remain high (subject to
noise) and a characteristic bit pattern (normally all ones) is returned. Software can detect a hole by doing a
store to, followed by a load from, a byte on a 16 Mbyte boundary. If the data read does not agree with the
data written, then a hole exists. If they agree, the same test with a different bit pattern should be used
before concluding that real memory exists. (Note that parity checking should be disabled when doing these
checks, as parity errors will be reported if the noise pattern contains bad parity and parity checking is
enabled.)

If the selected set consists of 4 Mbit DRAMs, then all 16 Mbytes of address space spanned by that
set are valid and correspond to unique memory locations. If the selected set consists of 1 Mbit DRAMs,
then only 4 Mbytes of unique memory exist, but it appears four times in the 16 Mbytes of address space
spanned by the set, repeating at every 4 Mbyte boundary. This "mirror" behavior can be detected by
software by doing a store of one bit pattern to offset 0 of a set, followed by a store of another bit pattern to
offset 4 Meg (0x00400000) of the set, followed by a load from offset 0. If the data at offset 0 was changed
by the store to offset 4 Meg, then only 4 Mbytes of memory is present and the rest is filled with mirrors.
The following decision table summarizes this behavior.



    PA27   PA26   SIMM Set   PA23:22      Action

    1      -      -          -            Timeout
    0      1      -          -            Timeout
    0      0      none       -            Hole
    0      0      4 Mbit     -            Memory (16 Mbytes worth)
    0      0      1 Mbit     00           Memory (4 Mbytes worth)
    0      0      1 Mbit     01, 10, 11   Mirror
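
The decision table can be expressed as a small classification function. The sketch below is an
illustration only: the type and function names are hypothetical, and the kind of SIMM set installed
at the selected position is passed in by the caller rather than probed from hardware.

```c
#include <stdint.h>

typedef enum { SET_NONE, SET_1MBIT, SET_4MBIT } simm_set_t;
typedef enum { T0_TIMEOUT, T0_HOLE, T0_MEMORY, T0_MIRROR } t0_result_t;

/* Index (0..3) of the SIMM set selected by a Type 0 physical address:
 * PA25 picks the controller, PA24 picks the set within it. */
static inline unsigned type0_set_index(uint32_t pa)
{
    return (pa >> 24) & 0x3u;
}

/* Classify a Type 0 access, given what is installed in the set that
 * the address selects. */
static inline t0_result_t classify_type0(uint32_t pa, simm_set_t installed)
{
    if (pa & (1u << 27))            /* PA27 must be zero */
        return T0_TIMEOUT;
    if (pa & (1u << 26))            /* PA26:25 = 2 or 3: no controller */
        return T0_TIMEOUT;

    switch (installed) {
    case SET_4MBIT:
        return T0_MEMORY;           /* all 16 Mbytes are unique */
    case SET_1MBIT:                 /* PA23:22 != 0 aliases the first 4 Mbytes */
        return ((pa >> 22) & 0x3u) == 0 ? T0_MEMORY : T0_MIRROR;
    case SET_NONE:
    default:
        return T0_HOLE;
    }
}
```

A memory sizing routine could walk the four set positions, use the store and load probes described
earlier to learn which kind of set is present, and then rely on this classification to decide which
address ranges to hand to the allocator.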



4.2. Type 1 Space

Type 1 space contains all of the I/O devices, including those that are associated with the Sbus. Bit
PA27 is used to indicate an onboard device (PA27=0) or an Sbus device (PA27=1). For onboard devices,
PA26:24 (and in some cases PA26:20) determine the particular device. For Sbus devices, PA26:25 select
one of four Sbus slots. The (physical or logical) board plugged into the Sbus slot then has an address space
of 25 bits, or 32 Mbytes, to divide up as it sees fit. Sbus addressing is further described in the section
"Sbus Devices", below. For compatibility with the Sun4 architecture conventions, the non-existent bits
(PA31:28) are assumed to be all ones. The following table describes the layout of Type 1 space:



    Address      SPARCstation-1 Use                Comments

    0xF0000000   Keyboard/Mouse                    Note 1
    0xF1000000   Serial Ports                      Note 1
    0xF2000000   TOD Clock and NVRAM               Note 2
    0xF3000000   Counter-Timer Registers           Note 3
    0xF4000000   Memory Error Registers            Note 1
    0xF5000000   Interrupt Register                Note 1
    0xF6000000   EPROM                             Note 3
    0xF7000000   EPD "Private":                    Note 4
    0xF7100000     ECC registers                   (HPD only)
    0xF7200000     Floppy Controller
    0xF7201000     Audio/ISDN
    0xF7400003     Auxiliary Input/Output Register
    0xF7F00000     VME Control Register            (SunFed only)
    0xF8000000   Sbus Slot 0 (25 bits)             Note 4
    0xF9000000     "
    0xFA000000   Sbus Slot 1 (25 bits)             Note 4
    0xFB000000     "
    0xFC000000   Sbus Slot 2 (25 bits)             Note 4
    0xFD000000     "
    0xFE000000   Sbus Slot 3 (25 bits)             Note 4
    0xFF000000     "



Note 1. Same as Sun4 use.

Note 2. Sun4 has a different kind of TOD at this address. It also has an EEPROM at a different address.

Note 3. Sun4 has same function, but at a different address.

Note 4. Sun4 has no corresponding function.

Reference to a Type 1 address to which no device responds results in a timeout.
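
The Type 1 decoding rules (PA27 for onboard versus Sbus, PA26:25 for the slot, 25 bits of offset)
can be sketched in C. The helper names below are hypothetical; addresses follow the convention used
in the table, with the non-existent bits PA31:28 taken as all ones.

```c
#include <stdbool.h>
#include <stdint.h>

#define TYPE1_BASE      0xF0000000u   /* PA31:28 assumed all ones */
#define TYPE1_SBUS_BIT  0x08000000u   /* PA27: 1 = Sbus device */
#define SBUS_SLOT_SHIFT 25            /* PA26:25 select the slot */
#define SBUS_SLOT_SIZE  0x02000000u   /* 25 bits = 32 Mbytes per slot */

/* True if a Type 1 address refers to an Sbus slot, not an onboard device. */
static inline bool type1_is_sbus(uint32_t pa)
{
    return (pa & TYPE1_SBUS_BIT) != 0;
}

/* Slot number (0..3) selected by an Sbus address. */
static inline unsigned sbus_slot(uint32_t pa)
{
    return (pa >> SBUS_SLOT_SHIFT) & 0x3u;
}

/* Offset of an address within its 32 Mbyte slot. */
static inline uint32_t sbus_offset(uint32_t pa)
{
    return pa & (SBUS_SLOT_SIZE - 1u);
}

/* Base address of a slot, in the table's 0xF... notation. */
static inline uint32_t sbus_slot_base(unsigned slot)
{
    return (TYPE1_BASE | TYPE1_SBUS_BIT) + ((uint32_t)(slot & 0x3u) << SBUS_SLOT_SHIFT);
}
```

Describing Sbus devices by slot number and relative offset, as these helpers do, matches the way the
devices are specified elsewhere in this paper.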
4.2.1. Onboard Devices

4.2.1.1. Keyboard/Mouse

The keyboard/mouse UART is a Z8530 chip (Zilog or AMD equivalent) accessed via byte loads and
stores at the following addresses:



    Address      Description

    0xF0000000   Mouse Control Port
    0xF0000002   Mouse Transmit (W)/Receive (R) Data Port
    0xF0000004   Keyboard Control Port
    0xF0000006   Keyboard Transmit (W)/Receive (R) Data Port



The Z8530 contains an array of read registers and write registers, accessed through the control port.
Access to a register is done by writing the register index to the control port, and then reading or writing the
register data to the control port. In addition, the UART transmit and receive data registers may be directly
accessed by writing and reading, respectively, from the Transmit/Receive Data Port.

See the Z8530 data sheet for more information.
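
The two-step register access described above (index first, then data, both through the control port)
might look as follows in C. This is a sketch under assumptions: the function names are hypothetical,
the control port is passed in as a pointer so that the caller supplies the mapped address, and any
access timing requirements from the data sheet are not modelled.

```c
#include <stdint.h>

/* Write 'value' to write register 'index' of one Z8530 channel.
 * 'ctl' points at that channel's control port. */
static inline void zs_write_reg(volatile uint8_t *ctl, uint8_t index, uint8_t value)
{
    *ctl = index;   /* first access selects the register */
    *ctl = value;   /* second access carries the data */
}

/* Read read register 'index' of one Z8530 channel. */
static inline uint8_t zs_read_reg(volatile uint8_t *ctl, uint8_t index)
{
    *ctl = index;   /* select the register */
    return *ctl;    /* then read its contents */
}
```

Since each access is a pair of bus operations on the same port, a driver would normally prevent an
interrupt handler from touching the same channel between the two steps.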



4.2.1.2. Serial Ports 

The serial ports UART is also a Z8530 chip, identical to the one used for the keyboard/mouse. It is
addressed as follows:

    Address      Description

    0xF1000000   Serial Port B Control Port
    0xF1000002   Serial Port B Transmit (W)/Receive (R) Data Port
    0xF1000004   Serial Port A Control Port
    0xF1000006   Serial Port A Transmit (W)/Receive (R) Data Port



4.2.1.3. TOD Clock and NVRAM (EEPROM)

The Time of Day Clock is a Mostek MK48T12-15 Zeropower/Timekeeper RAM which includes 2K
of RAM, the topmost 8 bytes of which are the clock. The Timekeeper contains its own battery backup,
which has a worst-case storage life (oscillator off or power on) of 11 years at 70°C and a worst-case
consumption life (oscillator on and power off) of 2.8 years at 0°C. Unlike EEPROMs, there is no limitation on
the number of times the CMOS RAM can be written, nor are special write timings required.

The Clock/NVRAM is accessed via byte, halfword, or fullword loads and stores at the following
addresses:



    Address                    Description

    0xF2000000 to 0xF20007d7   NVRAM
    0xF20007d8 to 0xF20007f7   "IDPROM"
    0xF20007f8                 TOD Control
    0xF20007f9                 Seconds (00-59)
    0xF20007fa                 Minutes (00-59)
    0xF20007fb                 Hour (00-23)
    0xF20007fc                 Day (01-07)
    0xF20007fd                 Date (01-31)
    0xF20007fe                 Month (01-12)
    0xF20007ff                 Year (00-99)



Thirty-two bytes of NVRAM acts as the ID prom" of SPARCstation-1. The id_machine byte con- 
tains 0x51; 0x50 is the architecture code for Sun4C, and 0x51 indicates die SPARCstation-1 machine. 

The TOD Control register should only be written with byte stores to prevent modifying the data to be 
Jiead. 

The time and date information is stored in 24-hour BCD format. For more information, including the
protocol to be used to read, write, start, and stop the clock, see the MK48T12-15 data sheet.
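
As an illustration of the BCD format, the following sketch converts between a packed-BCD register byte and a binary value. The function names are hypothetical, and the sketch assumes the caller has already masked off any control bits that share a register with the digits (see the MK48T12-15 data sheet for which bits those are).

```c
#include <stdint.h>

/* Hypothetical helpers; not taken from any Sun header.
 * Assumption: the byte holds exactly two BCD digits, with any
 * control bits already masked off by the caller. */

/* Convert one packed-BCD byte (two decimal digits) to binary. */
unsigned bcd_to_bin(uint8_t bcd)
{
    return (unsigned)(bcd >> 4) * 10u + (bcd & 0x0Fu);
}

/* Convert a binary value in the range 0..99 to packed BCD. */
uint8_t bin_to_bcd(unsigned value)
{
    return (uint8_t)(((value / 10u) << 4) | (value % 10u));
}
```

For example, a Minutes register reading 0x59 decodes to 59, and an hour of 23 is written as 0x23.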
4.2.1.4. Counter-Timer Registers

The Counter-Timer Registers are accessed via fullword loads and stores at the following addresses:

    Address       Description

    0xF3000000    Counter 0
    0xF3000004    Limit 0
    0xF3000008    Counter 1
    0xF300000C    Limit 1

All registers have the following format:

     31 30                     10 9          0
    +--+------------------------+------------+
    |L |      21-bit value      | 0 ...... 0 |
    +--+------------------------+------------+

    L    Limit Reached

Each counter is incremented by one in bit position 10 at one microsecond intervals. When a counter
reaches the value in its corresponding limit register, it is reset to "one microsecond," the limit-reached bit
in both the counter and limit registers is set, and an interrupt is generated (if enabled) at level 10 for
Counter 0 and level 14 for Counter 1.

The interrupt is cleared and the limit bits reset by reading the appropriate limit register. Reading the
counter register does not change the state of the limit bits. Writing the limit register resets the counter
register to a value equivalent to one microsecond. Except for testing purposes, the counter registers should
not be written.

Setting a limit register to zero causes the corresponding counter to free run. Interrupts will occur
when the counter overflows back to zero, approximately every 2 seconds.
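
The register layout above can be exercised with a few helpers. This is a minimal sketch, assuming the format shown (limit-reached flag in bit 31, microsecond count in bits 30:10); the macro and function names are invented for illustration.

```c
#include <stdint.h>

/* Hypothetical names; bit positions follow the register format above. */
#define CT_LIMIT_REACHED 0x80000000u  /* bit 31: L */
#define CT_VALUE_SHIFT   10           /* count occupies bits 30:10 */
#define CT_VALUE_MASK    0x001FFFFFu  /* 21-bit count */

/* Build the value to store in a limit register for a period in microseconds. */
uint32_t ct_limit_from_usec(uint32_t usec)
{
    return (usec & CT_VALUE_MASK) << CT_VALUE_SHIFT;
}

/* Extract the microsecond count from a counter or limit register value. */
uint32_t ct_usec(uint32_t reg)
{
    return (reg >> CT_VALUE_SHIFT) & CT_VALUE_MASK;
}

/* Nonzero if the limit-reached bit is set in a register value. */
int ct_limit_reached(uint32_t reg)
{
    return (reg & CT_LIMIT_REACHED) != 0;
}
```

A 10 millisecond tick would use ct_limit_from_usec(10000). The 21-bit count also accounts for the free-running period quoted above: 2^21 microseconds is roughly 2.1 seconds.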

4.2.1.5. Memory Error Registers

SPARCstation-1 uses a single Parity Control Register. This is a fullword read/write register at
location 0xF4000000 in Type 1 physical space. The format of this register is as follows:

     31                        8  7  6  5  4  3  2  1  0
    +---------------------------+--+--+--+--+--+--+--+--+
    | 0 ..................... 0 |E |M |T |N |A |B |C |D |
    +---------------------------+--+--+--+--+--+--+--+--+

    E    Parity Error. Set on any parity error.
    M    Multiple Errors. Set when a parity error occurs and E=1.
    T    Parity Test. When set, inverse parity is generated.
    N    Parity Check. Enables parity checking.
    A    Parity Error 24. Records parity error on data bits 31:24.
    B    Parity Error 16. Records parity error on data bits 23:16.
    C    Parity Error 08. Records parity error on data bits 15:8.
    D    Parity Error 00. Records parity error on data bits 7:0.

The bits that indicate errors (E, M, and A-D) are cleared when the register is read. All bits are cleared on
reset.

Note that when a parity error occurs, the cache will have loaded itself with the data from memory
anyway. This means that software must flush the cache after parity errors if it is to continue operation. On
a single parity error (M=0), only the affected cache line (as determined from the old PC, the SEVAR, or
the ASEVAR, as appropriate) need be flushed. On multiple parity errors (M=1), the entire cache must be
flushed.
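
The flush policy just described can be sketched as a small decision function. The names are hypothetical and the bit masks assume E through D occupy bits 7 through 0 of the Parity Control Register, as in the format above. Because reading the register clears the error bits, the sketch operates on a value that was read once and saved.

```c
#include <stdint.h>

/* Hypothetical masks, assuming E..D occupy bits 7..0 of the register. */
#define PAR_ERROR    0x80u  /* E: parity error */
#define PAR_MULTIPLE 0x40u  /* M: multiple errors */
#define PAR_TEST     0x20u  /* T: generate inverse parity */
#define PAR_CHECK    0x10u  /* N: enable parity checking */

enum flush_scope { FLUSH_NONE, FLUSH_LINE, FLUSH_ALL };

/* Decide how much of the cache to flush, given a saved copy of the
 * Parity Control Register (the error bits clear when it is read). */
enum flush_scope parity_flush_scope(uint32_t saved_pcr)
{
    if ((saved_pcr & PAR_ERROR) == 0)
        return FLUSH_NONE;
    return (saved_pcr & PAR_MULTIPLE) ? FLUSH_ALL : FLUSH_LINE;
}
```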

Also note that the address in the SEVAR or ASEVAR, as appropriate, may not be the address of the
word with the parity error, if the error occurred during a cache-fill operation.

4.2.1.6. Interrupt Register

The Interrupt Register is a one-byte read/write register at location 0xF5000000 in Type 1 physical
space. The format of this register is as follows:

      7  6  5  4  3  2  1  0
    +--+--+--+--+--+--+--+--+
    |A |0 |C |D |E |F |G |H |
    +--+--+--+--+--+--+--+--+

    A    Enable Level 14 Interrupts
    C    Enable Level 10 Interrupts
    D    Enable Level 8 Interrupts
    E    Software Interrupt Level 6
    F    Software Interrupt Level 4
    G    Software Interrupt Level 1
    H    Enable all Interrupts

Writing a zero to an Enable Level N Interrupt bit only masks out that interrupt; it does not clear the source.
Writing a one to a software interrupt bit requests an interrupt on that level; the bit must be cleared to clear
the request.

Writing a zero to the Enable All Interrupts bit will clear the Asynchronous Memory (level 15)
Interrupt, as well as masking all interrupts. Of course, interrupts should be immediately re-enabled by writing a
one.

On reset, all bits are cleared and all interrupts are reset.
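
A minimal sketch of the level 15 clearing sequence follows. The mask names are hypothetical and assume bits A through H map to bits 7 through 0 as in the format above. The pointer is assumed to refer to a virtual address the caller has mapped onto the Interrupt Register; how that mapping is set up is outside the sketch.

```c
#include <stdint.h>

/* Hypothetical masks for the one-byte Interrupt Register. */
#define IR_ENA_L14 0x80u  /* A: enable level 14 interrupts */
#define IR_ENA_L10 0x20u  /* C: enable level 10 interrupts */
#define IR_ENA_L8  0x10u  /* D: enable level 8 interrupts */
#define IR_SOFT_6  0x08u  /* E: software interrupt level 6 */
#define IR_SOFT_4  0x04u  /* F: software interrupt level 4 */
#define IR_SOFT_1  0x02u  /* G: software interrupt level 1 */
#define IR_ENA_ALL 0x01u  /* H: enable all interrupts */

/* Clear a pending level 15 interrupt: write Enable All as zero,
 * then immediately re-enable by writing it as one. The other
 * bits are written back unchanged. */
void clear_level15(volatile uint8_t *ireg)
{
    uint8_t saved = *ireg;

    *ireg = (uint8_t)(saved & ~IR_ENA_ALL);
    *ireg = (uint8_t)(saved | IR_ENA_ALL);
}
```

Note that the sketch always leaves Enable All set afterwards, following the advice above to re-enable interrupts immediately.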

4.2.1.7. EPROM 

SPARCstation-1 has 128K bytes of EPROM containing the boot monitor beginning at location
0xF6000000 in Type 1 physical space. The EPROM is also referenced by all Supervisor Virtual addresses
when the ENA_NOTBOOT bit in the System Enable Register is zero, for example at boot time. The boot
code must initialize the MMU to at least map itself before setting the ENA_NOTBOOT bit to one.

Note that the EPROM does not obey the normal memory mapping rules. PA[16:0] into the EPROM
always come from VA[16:0]. Although VA[29:12] are processed by the MMU to select a physical
address, when bits PA[27:24] of that physical address select the EPROM then bits PA[23:12] from the
MMU are ignored. This means that, for proper operation of the EPROM, it must be mapped one-for-one to
contiguous virtual pages beginning on a 128K boundary.

4.2.1.8. Floppy Controller 

The Floppy Disk Controller is an Intel 82072. It is accessed using byte loads and stores at the
following addresses:

    Address       Description

    0xF7200000    Main Status (R)/Data Rate Select Register (W)
    0xF7200001    FIFO Data Port (R/W)

For more information see the Intel 82072 data sheet. Note that the floppy must be selected as drive 1
(or 3, but 1 is preferred) in the command sequence sent to the controller. See also the Terminal Count and
Floppy Eject bits in the "Auxiliary Input/Output Register" described below.

4.2.1.9. Audio/ISDN 

The audio interface of the SPARCstation-1 is provided through the Main Audio Processor (MAP) of
the AMD 79C30A Digital Subscriber Controller. The 79C30A is a highly integrated circuit which
provides an ISDN 4-wire subscriber level interface, an audio processing circuit, a parallel microprocessor
interface, and a serial interface. For SPARCstation-1 audio use, the microprocessor interface and the audio
processing circuits are the only portions of the circuit which are used.

The interrupt from the 79C30 is attached to IRQ<13> of the MMU (which is interrupt level 13). The
data bus is connected to the IO data bus. The circuit includes an oscillator circuit which uses an externally
provided 12.288 MHz crystal with a tolerance of + or - 80 ppm. The oscillator is a parallel resonant circuit.

The 79C30 registers are located at a base address of 0xF7201000. The 79C30 is accessed using byte
loads and stores at the following addresses:



    Address       WR*  RD*  Register description

    0xF7201000     0    1   Command Register (CR), write only
    0xF7201000     1    0   Interrupt Register (IR), read only
    0xF7201001     0    1   Data Register (DR), write
    0xF7201001     1    0   Data Register (DR), read
    0xF7201002     1    0   D-channel Status Register 1 (DSR1), read only
    0xF7201003     1    0   D-channel Error Register (DER), read only
    0xF7201004     0    1   D-channel Transmit Buffer (DCTB), write only (8-byte FIFO)
    0xF7201004     1    0   D-channel Receive Buffer (DCRB), read only (8-byte FIFO)
    0xF7201005     0    1   Bb channel Transmit Buffer (BBTB), write only
    0xF7201005     1    0   Bb channel Receive Buffer (BBRB), read only
    0xF7201006     0    1   Bc channel Transmit Buffer (BCTB), write only
    0xF7201006     1    0   Bc channel Receive Buffer (BCRB), read only
    0xF7201007     1    0   D-channel Status Register 2 (DSR2), read only



Note that the other registers in the 79C30, of which there are many, are indirectly accessed through
the command register. Pages 2-71 through 2-77 of the 79C30A Data Sheet describe this indirect
addressing.

Please refer to the 79C30A Data Sheet for full details on operation of this circuit.

4.2.1.10. Auxiliary Input/Output Register

The Auxiliary Input/Output Register is a one-byte, read-write register at location 0xF7400000 in
Type 1 physical space. It has the following format:

      7  6  5  4  3  2  1  0
    +--+--+--+--+--+--+--+--+
    |1 |1 |D |C |S |T |E |L |
    +--+--+--+--+--+--+--+--+

    D    In     Density
    C    In     Floppy Diskette Change (must be written as one)
    S    Out    Floppy Drive Select
    T    Out    TC (Floppy controller Terminal Count input)
    E    Out    Floppy Eject
    L    Out    LED (1=on, 0=off)



All bits are set to one on reset.

Bit 5 (Density) is a signal from the drive indicating the density of the diskette inserted. A 1 indicates
high density, a 0 indicates low density. This signal is meaningful only if the floppy drive is capable of
sensing the "density" hole in the diskette. The Sony drives do not generate this signal; for them, software
must determine the density of the inserted diskette through trial and error. This can be done by initializing
the controller with parameters for a given density and attempting to read the diskette; if the wrong
parameters were chosen, read errors will occur. Note that the density of an unformatted floppy cannot be
determined through this method; the floppy format software must have a user option to set the density to be
used. (If the user selects the wrong density, the floppy will be unusable, but the user will quickly discover
this mistake.)

Bit 4 (Floppy Diskette Change) is an input bit that signifies when a diskette has been removed from
the drive. This bit must always be written as one in order for it to work. It reads as one when the drive is
selected and there is no diskette in the drive. It reads as zero if the drive is not selected or if a diskette is
present in the drive. The Sony drives reset the bit when they receive a step pulse from the controller, i.e.,
when the software issues a "Seek" command. Other vendor drives require a separate Diskette Change
reset signal; a bit will need to be provided for this function in the Auxiliary Input/Output Register if a
non-Sony drive is chosen for SPARCstation. When will this decision be made?

Bit 3 (Floppy Drive Select) is connected to the floppy drive select pin. It is used in conjunction with
all floppy operations, whether through the Floppy Disk Controller registers or the bits in the Auxiliary I/O
Register. A one selects the floppy drive; a zero de-selects it.

Bit 2 (TC) is connected to the Terminal Count input pin of the floppy controller. It is used to signal
the floppy controller (which is designed to be connected to a DMA controller, even though in
SPARCstation-1 it is not) that all the data for a given operation has been transferred. This is done by
writing a 1 to this bit, delaying for a specific amount of time, and then writing a 0 to it. (The specific amount of
time depends upon the data rate and can be found in the Intel 82072 data sheet.)

Bit 1 (Floppy Eject) is connected to the floppy drive eject mechanism. To eject a floppy, set bit 3
(Floppy Drive Select), wait 2.0 microseconds, set bit 1, hold it set for at least 2.0 microseconds, then reset
both it and bit 3 to zero.

Bit 0 (LED) controls the LED on the front panel.

Unused bit positions should be written with ones when writing to the register. This will allow them
to be used for input signals if this becomes necessary.
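
The eject procedure can be sketched as the three byte values software would write in turn. The names are hypothetical. The sketch writes the unused bits and Diskette Change as ones, as required above; writing the input-only Density bit as one as well is an assumption made here for safety, not something the text states. The delays between writes are left to the caller.

```c
#include <stdint.h>

/* Hypothetical masks for the Auxiliary Input/Output Register. */
#define AUX_UNUSED   0xC0u  /* bits 7:6, written as ones */
#define AUX_DENSITY  0x20u  /* D: input */
#define AUX_DISKCHG  0x10u  /* C: input, must be written as one */
#define AUX_SELECT   0x08u  /* S: floppy drive select */
#define AUX_TC       0x04u  /* T: terminal count */
#define AUX_EJECT    0x02u  /* E: floppy eject */
#define AUX_LED      0x01u  /* L: front panel LED */

/* Assumption: the Density input is also written as one. */
#define AUX_ALWAYS_ONE (AUX_UNUSED | AUX_DENSITY | AUX_DISKCHG)

/* Fill seq[] with the values to write for an eject, keeping the LED state.
 * The caller waits at least 2.0 microseconds after each of the first two writes. */
void aux_eject_sequence(int led_on, uint8_t seq[3])
{
    uint8_t base = (uint8_t)(AUX_ALWAYS_ONE | (led_on ? AUX_LED : 0u));

    seq[0] = (uint8_t)(base | AUX_SELECT);              /* select the drive */
    seq[1] = (uint8_t)(base | AUX_SELECT | AUX_EJECT);  /* assert eject */
    seq[2] = base;                                      /* release both */
}
```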

4.2.2. Sbus Devices

Unlike previous busses, the Sbus is geographically addressed. PA26:25 select which of four Sbus
"slots" is being referenced. A board plugged into an Sbus slot has PA24:0, or 25 bits or 32 Mbytes of
address space, to divide up among the devices contained on that board. A Forth program
beginning at offset 0 of the slot describes the devices on that board to the system. The details of the Forth
specification are described in Sun Forth User's Guide.

Slot 0 is not a physical slot. Rather, it refers to the onboard DMA, SCSI, and Ethernet controllers
which, for convenience, are viewed as being plugged into Slot 0.

Slots 1, 2, and 3 are physical slots into which the user may plug boards containing devices. Slots 1
and 2 have DVMA-master capability; slot 3 is a slave-only slot and does not support boards that operate as
DVMA masters. The board containing the video subsystem (video control registers, RAMDAC, and frame
buffer) is usually, but need not be, plugged into Slot 3.

If no device responds to a particular Sbus address, a bus timeout will occur.

The following table summarizes the devices:

    PA26:25    Device

    00         Onboard DMA, SCSI, and Ethernet controllers
    01         Sbus Slot 1
    10         Sbus Slot 2
    11         Sbus Slot 3 (usually video subsystem)
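
Geographic addressing reduces to a shift and an OR. The sketch below assumes the Slot 0 base of 0xF8000000 in Type 1 space given elsewhere in this document; the names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical names. Assumption: Slot 0 begins at 0xF8000000. */
#define SBUS_BASE       0xF8000000u
#define SBUS_SLOT_SHIFT 25                       /* slot number lives in PA26:25 */
#define SBUS_SLOT_SIZE  (1u << SBUS_SLOT_SHIFT)  /* 32 Mbytes per slot */

/* Form the address of a device register from a slot number (0..3)
 * and an offset within that slot (PA24:0). */
uint32_t sbus_addr(unsigned slot, uint32_t offset)
{
    return SBUS_BASE
         | ((uint32_t)(slot & 3u) << SBUS_SLOT_SHIFT)
         | (offset & (SBUS_SLOT_SIZE - 1u));
}
```

With these definitions, sbus_addr(3, 0) gives 0xFE000000, which agrees with the Slot 3 base quoted for the video subsystem.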



4.2.2.1. DMA, SCSI, and Ethernet Devices

The following table describes the offsets to the onboard DMA, SCSI, and Ethernet devices, relative
to the beginning of Sbus "Slot 0" (base physical address 0xF8000000 in Type 1 space).

    Offset      Description

    0x000000    ID (4 bytes, 0xFE810101)
    0x400000    DMA Registers
    0x800000    SCSI Registers
    0xC00000    Ethernet Registers



4.2.2.1.1. DMA Registers

The DMA registers are accessed via fullword loads and stores to the following offsets (the addresses
in this table do not include the slot base address, which must be added to the device offset):

    Address     Description

    0x400000    DMA Control/Status Register
    0x400004    DMA Address Register
    0x400008    DMA Byte Count
    0x40000C    Diagnostic Register

The DMA registers are used when programming SCSI operations. Other than the ILACC bit in the
DMA Control/Status Register, they are not used when programming Ethernet operations.

4.2.2.1.1.1. DMA Control/Status Register

The DMA Control/Status Register has the following format:

     31    28 27                      16 15 14 13 12 11 10  9  8  7  6  5  4  3  2  1  0
    +--------+--------------------------+--+--+--+-----+--+--+--+--+--+--+--+-----+--+--+
    | DEV_ID | unused (read as zero)    |L |T |C | ADR |P |N |W |R |D |F |I | PCK |E |J |
    +--------+--------------------------+--+--+--+-----+--+--+--+--+--+--+--+-----+--+--+

    DEV_ID  Device ID. Read-only. (0b1000 in this implementation.)

    L       ILACC. When 0, the Ethernet/DMA interface is configured to use the Lance Ethernet controller.
            When 1, the interface is configured to use ILACC, "the new Ethernet chip from AMD" (Cliff
            Buckley).

    T       TC. Terminal Count. Read-only. Byte counter has expired. This bit is cleared by setting the Flush
            bit (bit 5).

    C       EN_CNT. Enable Count. Read/write. Enables the DMA Byte Count Register. (Not used in normal
            SPARCstation-1 operation.)

    ADR     BYTE_ADDR. Read-only. Next byte number to be accessed.

    P       REQ_PEND. Request pending. Read-only. Set when the DMA interface is active. RESET and
            FLUSH must not be asserted if REQ_PEND is one.

    N       EN_DMA. Enable DMA. Read/write. Set to enable DMA activity, reset to disable.

    W       WRITE. Read/write. Set for DMA from device to memory (read), reset for DMA from memory to
            device (write).

    R       RESET. Read/write. When set, acts as a hardware reset. ERR_PEND, PACK_CNT, INT_EN,
            FLUSH, DRAIN, WRITE, EN_DMA, REQ_PEND, EN_CNT, and TC are all set to zero. RESET
            remains at 1, and must be set back to 0 by software to resume operation.

    D       DRAIN. Read/write. Set to force remaining pack register bytes to be drained to memory. Clears
            itself.

    F       FLUSH. Write-only. Set to force PACK_CNT and ERR_PEND to zero. Also clears TC and the
            interrupt TC=1 causes. Always reads as zero.

    I       INT_EN. Interrupt enable. Read/write. Set to enable interrupts.

    PCK     PACK_CNT. Pack Count. Read-only. Number of bytes in Pack Register.

    E       ERR_PEND. Error Pending. Read-only. Set when a memory exception occurs. Reset by setting
            FLUSH. DMA activity stops until reset.

    J       INT_PEND. Interrupt Pending. Read-only. Set when TC=1 or when external device raises an
            interrupt. Cleared when read (if TC=1 is the cause) or by servicing the external device (if that is the
            cause).
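
The field layout can be captured as a set of masks. This is a sketch only: the macro and function names are invented here, and the mask values are derived from the bit positions in the format above rather than from any Sun header file.

```c
#include <stdint.h>

/* Hypothetical masks derived from the DMA Control/Status Register format. */
#define DMA_INT_PEND  0x00000001u  /* J */
#define DMA_ERR_PEND  0x00000002u  /* E */
#define DMA_PACK_CNT  0x0000000Cu  /* PCK, bits 3:2 */
#define DMA_INT_EN    0x00000010u  /* I */
#define DMA_FLUSH     0x00000020u  /* F */
#define DMA_DRAIN     0x00000040u  /* D */
#define DMA_RESET     0x00000080u  /* R */
#define DMA_WRITE     0x00000100u  /* W: set for device-to-memory */
#define DMA_EN_DMA    0x00000200u  /* N */
#define DMA_REQ_PEND  0x00000400u  /* P */
#define DMA_BYTE_ADDR 0x00001800u  /* ADR, bits 12:11 */
#define DMA_EN_CNT    0x00002000u  /* C */
#define DMA_TC        0x00004000u  /* T */
#define DMA_ILACC     0x00008000u  /* L */
#define DMA_DEV_ID    0xF0000000u  /* DEV_ID, bits 31:28 */

/* Number of bytes currently held in the pack register. */
unsigned dma_pack_count(uint32_t csr) { return (csr & DMA_PACK_CNT) >> 2; }

/* Device ID field (0b1000 in this implementation). */
unsigned dma_dev_id(uint32_t csr) { return csr >> 28; }

/* RESET and FLUSH must not be asserted while a request is pending. */
int dma_may_reset_or_flush(uint32_t csr) { return (csr & DMA_REQ_PEND) == 0; }

/* Control word to start a SCSI transfer with interrupts enabled.
 * to_memory is nonzero for device-to-memory transfers. */
uint32_t dma_scsi_start_word(int to_memory)
{
    uint32_t word = DMA_EN_DMA | DMA_INT_EN;

    if (to_memory)
        word |= DMA_WRITE;
    return word;
}
```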

4.2.2.1.1.2. DMA Address Register

The DMA Address Register has the following format:

     31                24 23                              0
    +--------------------+---------------------------------+
    | VA31:24 - latched  | VA23:0 - address                |
    +--------------------+---------------------------------+

The high byte is latched by the hardware and indicates which 16 Mbyte region of Virtual Memory is
accessed. (The MMU recognizes a DMA virtual address and forces Context 0 to be selected.) The
low-order 3 bytes contain the address of the byte to be transferred. Rollover is only through the low-order 24
bits.
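
Because rollover is confined to the low-order 24 bits, a transfer that crosses a 16 Mbyte boundary wraps within its region instead of advancing into the next one. The following sketch (hypothetical function names) models that behavior so software can check a buffer before starting a transfer.

```c
#include <stdint.h>

/* Address the DMA Address Register would hold after nbytes have been
 * transferred: the latched high byte is kept and only the low-order
 * 24 bits advance, wrapping within the 16 Mbyte region. */
uint32_t dma_next_address(uint32_t addr, uint32_t nbytes)
{
    return (addr & 0xFF000000u) | ((addr + nbytes) & 0x00FFFFFFu);
}

/* Nonzero if a transfer of nbytes starting at addr would wrap
 * around the end of its 16 Mbyte region. */
int dma_transfer_wraps(uint32_t addr, uint32_t nbytes)
{
    uint64_t end = (uint64_t)(addr & 0x00FFFFFFu) + nbytes;

    return end > 0x01000000u;
}
```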

4.2.2.1.1.3. DMA Byte Count

The DMA Byte Count Register has the following format:

     31        24 23                              0
    +------------+---------------------------------+
    | 0 ...... 0 | BCNT23:0 - counter              |
    +------------+---------------------------------+

This register is only used when EN_CNT is on in the DMA Control/Status Register, and so is not
used in normal SPARCstation-1 operation. The high byte is unused and will always read back as zero.
The low-order bytes contain the number of bytes to be transferred, and count down to zero. When zero is
reached, TC, and thus INT_PEND, are set to one. Further DMA transfers cannot take place until a new
value is loaded into the Byte Count Register.

4.2.2.1.1.4. Diagnostic Register

The format of the Diagnostic Register is not available.



4.2.2.1.2. SCSI Registers

The SCSI registers are accessed via byte loads and stores to the following offsets (the addresses in
this table do not include the slot base address, which must be added to the device offset):

    Address     Description

    0x800000    Transfer Count Low
    0x800004    Transfer Count High
    0x800008    FIFO Data
    0x80000C    Command
    0x800010    Status/Bus ID
    0x800014    Interrupt Status/Timeout
    0x800018    Sequential step/Synchronization transfer period
    0x80001C    FIFO flags/Synchronization offset
    0x800020    Configuration
    0x800024    Clock Conversion Factor (write only)
    0x800028    ESP TEST (chip test use only)
    0x80002C    ESP II Configuration-2

Note that byte accesses must be performed even though the addresses are all fullword-aligned.

Since the SCSI controller uses the DMA controller to perform the actual transfer of data to and from
memory, the two devices must be programmed together. One possible algorithm is as follows:

    scsi_start()
    {
        /* start an operation on the SCSI */
        lock data pages into contiguous virtual memory;
        DMA_address_register = starting virtual address;
        setup SCSI registers (except for "go");
        DMA_control_status_register = (EN_DMA | INT_EN | (other bits));
        start SCSI;
    }

    /* The SCSI will interrupt us when it is done. */

    scsi_interrupt()
    {
        /* must drain DMA on a read from disk/write to memory */
        if (last operation == READ) {
            DMA_control_status_register = (DRAIN);
        }
    }

For a detailed description of the SCSI registers, see the NCR 53C90 Data Sheet.

4.2.2.1.3. Ethernet Registers

The Ethernet registers are accessed via halfword loads and stores to the following offsets (the
addresses in this table do not include the slot base address, which must be added to the device offset):

    Address     Description

    0xC00000    Register Data Port (RDP)
    0xC00002    Register Address Port (RAP)

For a detailed description of the Ethernet registers, see the AMD Am7990 Data Sheet.



4.2.2.2. Video Subsystem

The following table describes the offsets to the devices located on the Video Subsystem Board. This
board is usually plugged into Sbus "Slot 3" (base physical address 0xFE000000 in Type 1 space).

    Offset      Description

    0x000000    ID (4 bytes, 0xFE010101)
    0x400000    Video and DAC Registers
    0x800000    Frame Buffer



4.2.2.2.1. Video and DAC Registers

The Video and DAC registers are accessed via byte loads and stores to the following offsets (the
addresses in this table do not include the slot base address, which must be added to the device offset):

    Address     Description

    0x400000    Video Control Register
    0x400001    Video Status Register
    0x400002    HBS (Horizontal Blank Set)
    0x400003    HBC (Horizontal Blank Clear)
    0x400004    HSS (Horizontal Sync Set)
    0x400005    HSC0 (Horizontal Sync Clear, !VS)
    0x400006    HSC1 (Horizontal Sync Clear, VS)
    0x400007    VBSH (Vertical Blank Set High Byte)
    0x400008    VBSL (Vertical Blank Set Low Byte)
    0x400009    VBC (Vertical Blank Clear)
    0x40000A    VSS (Vertical Sync Start)
    0x40000B    VSC (Vertical Sync Clear)
    0x400010    DAC Address Register
    0x400014    DAC Color Palette Register Port
    0x400018    DAC Control Register Port
    0x40001C    DAC Overlay Palette Register Port

See the S-4 Video data sheet for a detailed description of the Video Registers, and the Brooktree
Bt458/451 data sheet for a detailed description of the DAC Registers. Note that setting incorrect values
into the registers can damage the attached monitor.

Note that the DAC registers are 8 bits wide even though they are aligned on fullword boundaries.
Fullword accesses can be used to quickly read or write one or more palette entries by storing the index of
the first palette to be accessed in the address register and then doing fullword accesses to the appropriate
palette port. The data must be packed into bytes in the order "RGBRGBRGBRGB"; in other words, 3
fullwords will hold 4 palette entries. Palette entries are only stored when the Blue value is written; partial
update of a palette is not possible.
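
The packing rule can be sketched as follows. The structure and function names are hypothetical. The sketch assumes the first byte of the "RGBRGB..." stream occupies the most significant byte of each fullword, which matches SPARC big-endian byte order; the text does not state this explicitly.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical palette entry type. */
struct rgb { uint8_t r, g, b; };

/* Pack n palette entries into fullwords in the order R,G,B,R,G,B,...
 * Assumption: the first byte of the stream is the most significant
 * byte of each fullword. n must be a multiple of 4 so that the data
 * fills whole fullwords; otherwise nothing is written and 0 is returned.
 * Returns the number of fullwords produced (3 per 4 entries). */
size_t pack_palette(const struct rgb *pal, size_t n, uint32_t *words)
{
    size_t nbytes = n * 3;
    size_t i;

    if (n % 4 != 0)
        return 0;
    for (i = 0; i < nbytes / 4; i++)
        words[i] = 0;
    for (i = 0; i < nbytes; i++) {
        const struct rgb *e = &pal[i / 3];
        uint8_t c = (i % 3 == 0) ? e->r : (i % 3 == 1) ? e->g : e->b;

        words[i / 4] |= (uint32_t)c << (24 - 8 * (i % 4));
    }
    return nbytes / 4;
}
```

The resulting fullwords would then be stored in order to the appropriate palette port after the starting index has been written to the DAC Address Register.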

4.2.2.2.2. Frame Buffer

The frame buffer is a megabyte of RAM occupying offsets from 0x800000 to 0x8FFFFF. Each byte
corresponds to one pixel. Accesses may be by bytes, by halfwords, or by fullwords.

If the frame buffer is only half-populated, then only the lower four bits of each byte will be
significant. As the upper four bits will be (weakly) pulled up with resistors, only the upper 16 color map
entries (entries 240 through 255) in the DAC will be usable. Software can detect this case by writing, then
reading, the frame buffer. If the upper four bits always read back as ones, independent of the data written,
then the frame buffer is half-populated. (This is grody. -Ed.)
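
The detection test can be expressed as a check over pairs of written and read-back bytes. This is a sketch with hypothetical names; it takes the probe results as arrays so that the actual frame buffer accesses stay in the caller.

```c
#include <stddef.h>
#include <stdint.h>

/* Decide whether the frame buffer looks half-populated, given n test
 * bytes that were written and the values read back from the same
 * locations. Returns 1 only if every read-back value has its upper
 * four bits set and at least one written value did not (otherwise
 * the probe could not tell the two cases apart). */
int fb_half_populated(const uint8_t *written, const uint8_t *readback, size_t n)
{
    int informative = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        if ((readback[i] & 0xF0u) != 0xF0u)
            return 0;          /* an upper bit read back as zero */
        if ((written[i] & 0xF0u) != 0xF0u)
            informative = 1;   /* this pattern distinguishes the cases */
    }
    return informative;
}

/* Color map entry actually displayed for a pixel value when the
 * upper four bits are pulled up (entries 240 through 255). */
uint8_t fb_half_color_index(uint8_t pixel)
{
    return (uint8_t)(0xF0u | (pixel & 0x0Fu));
}
```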

5. Interrupt Levels

The following table describes the interrupt levels defined by the Sun4 Architecture and the
SPARCstation-1 implementation.

    Level    Sun4 Use                         SPARCstation-1 Use

    15       Memory Error                     Asynchronous Memory Error
    14       Clock                            Counter 1
    13       VMEbus level 7                   Audio
    12       Keyboard, Mouse, Serial Ports    Same
    11       VMEbus level 6                   Floppy
    10       Clock                            Counter 0
    9        VMEbus level 5                   Sbus IRQ7
    8        Video                            Sbus IRQ6
    7        VMEbus level 4                   Video, Sbus IRQ5
    6        Ethernet, Software request 6     Software request 6
    5        VMEbus level 3                   Ethernet, Sbus IRQ4
    4        SCSI, Software request 4         Software request 4
    3        VMEbus level 2                   SCSI, DMA, Sbus IRQ3
    2        VMEbus level 1                   Sbus IRQ2
    1        Software request 1               Same, plus Sbus IRQ1



6. Resets 

Although there is only one type of reset in SPARCstation-1 (a reset of the entire machine that causes
system registers to be restored to a known state), there are three ways to effect a reset:

(1) Power-on. A power-on reset (POR) occurs when power is initially applied to SPARCstation-1.

(2) Watchdog. A watchdog reset occurs when the IU signals an error condition. This can occur, for
example, if the IU attempts to take a trap when traps are disabled.

(3) Software. Software can initiate a reset by writing a one to the ENA_RESET bit of the System 
Enable Register. 

The SE_WATCHDOG bit in the Synchronous Error Register is set to one on watchdog-initiated 
resets, and set to zero for all other resets. 

7. Contexts, Caching, and the MMU 

This section describes the interaction of the context register, the cache, and the MMU from the
programmer's perspective.

7.1. Context Register (ASI=2, A=0x30000000, byte access only)

The Context Register has the following format:

      7        4 3        0
    +-----------+----------+
    | 0 ..... 0 |   CID    |
    +-----------+----------+

Note that although the CID is four bits wide, only the low-order 3 bits (CID2:0) are actually used. CID3 is
ignored.

The context register selects one of 8 contexts for translating User Mode addresses. It exists in both
the Cache and the MMU.

Programming note: A byte store (STBA) into (ASI=2, A31:28=0x3) writes both the MMU and
Cache Context Registers. A byte load (LDUBA, LDSBA) from (ASI=2, A31:28=0x3, A0=0) reads the
MMU's Context Register, and a byte load from (ASI=2, A31:28=0x3, A0=1) reads the Cache's Context
Register. The ability to read each register separately is provided for diagnostic purposes; they should
always contain the same value and standard software will usually just read the MMU's Context Register.

7.2. MMU decoding of Virtual Addresses

From the MMU's standpoint, a virtual address has the following format:

     31 30 29                18 17          12 11                     0
    +-----+--------------------+--------------+------------------------+
    |     | segment (12 bits)  | page in      | byte in page (12 bits) |
    |     |                    | segment      |                        |
    |     |                    | (6 bits)     |                        |
    +-----+--------------------+--------------+------------------------+

Note: VA31:29 must all be the same (all 0 or all 1). An SE_INVALID error results otherwise.

CID2:0 is concatenated with VA29:18 to select one of 32K segment map entries. (One can view the
segment map as consisting of 8 contexts, each context containing 4K segments.) The segment map entry is
8 bits wide, although only the lower 7 bits are used, and points to a Page Map Entry Group (PMEG):

      7  6             0
    +--+----------------+
    |  |      PMEG      |
    +--+----------------+

PMEG6:0 is concatenated with VA17:12 to select one of 8K Page Map Entries (PME). (One can
view the page map as consisting of 128 PMEGs, each PMEG containing 64 pages.) The PME is 32 bits
wide, organized as follows:

     31 30 29 28 27 26 25 24 23        16 15                          0
    +--+--+--+--+-----+--+--+------------+-----------------------------+
    |V |W |S |X | TYP |A |M | 0 ...... 0 | physical page number        |
    |  |  |  |  |     |  |  |            | (16 bits)                   |
    +--+--+--+--+-----+--+--+------------+-----------------------------+

    V      1=entry is valid
    W      1=write access allowed
    S      1=Supervisor mode access only
    X      1=don't cache this page
    TYP    0=Main Memory; 1=Sbus and I/O space; 2,3=reserved for VMEbus
    A      1=page has been accessed
    M      1=page has been modified

PME15:0 is concatenated with VA11:0 to form a 28-bit physical address whose interpretation
depends upon the type field.

Programming Notes:

(1) A page is 4K bytes. A segment is 64 pages or 256K bytes. A context contains 4K segments or 1G
byte. This last is divided into two address ranges of 512M bytes each, from 0x00000000-0x1fffffff
and from 0xe0000000-0xffffffff.

(2) Unlike architectures used by other vendors, in this architecture there is no way to explicitly mark a
segment as invalid. However, the operating system can reserve one PMEG and mark all of its PMEs
invalid, and then point invalid segments at this PMEG. SunOS has traditionally used the last PMEG
for this purpose, but this may be subject to change.

(3) Because the cache ignores the context register when resolving accesses to supervisor-mode-only
pages, the kernel segments should be identical in each context. This can be accomplished by
repeating the same PMEG in the appropriate segment map entries.

(4) A context is selected by performing a byte store into the Context Register (ASI=2, A31:28=0x3).

A segment map is initialized by selecting a context, and then performing byte stores into (ASI=3,
A29:18=0x0 to 0xfff). (Half and fullword stores will work but are not recommended.)

A PMEG is initialized by selecting a context, and then performing fullword stores into (ASI=4,
A29:18=desired segment, A17:12=0x0 to 0x3f).

(5) The hardware does not ensure consistency between the cache and the MMU. The operating system
software must flush the cache appropriately before updating the MMU. Before changing the
mapping of a context, a Flush Cache (Context) operation must be performed. Before changing the
mapping of a segment, a Flush Cache (Segment) operation must be performed. Before changing the
mapping of a page, a Flush Cache (Page) operation must be performed. These operations are
described in the Cache section, below. Also note that these are not the only circumstances when
flushing the cache is necessary.
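
The translation steps in this section can be summarized as index arithmetic. The sketch below uses hypothetical function names and models only the bit selection: it does not read the segment map or page map, which on the real machine are reached through alternate-space accesses.

```c
#include <stdint.h>

/* Hypothetical helpers modelling the MMU's bit selection. */

/* VA31:29 must be all zeros or all ones. */
int va_is_valid(uint32_t va)
{
    uint32_t top = va >> 29;

    return top == 0u || top == 7u;
}

/* Segment map index: CID2:0 concatenated with VA29:18 (32K entries). */
uint32_t segmap_index(unsigned cid, uint32_t va)
{
    return ((uint32_t)(cid & 7u) << 12) | ((va >> 18) & 0xFFFu);
}

/* Page map index: PMEG6:0 concatenated with VA17:12 (8K entries). */
uint32_t pme_index(unsigned pmeg, uint32_t va)
{
    return ((uint32_t)(pmeg & 0x7Fu) << 6) | ((va >> 12) & 0x3Fu);
}

/* 28-bit physical address: PME15:0 concatenated with VA11:0. */
uint32_t pme_phys_addr(uint32_t pme, uint32_t va)
{
    return ((pme & 0xFFFFu) << 12) | (va & 0xFFFu);
}

/* Type field of a PME (bits 27:26). */
unsigned pme_type(uint32_t pme)
{
    return (pme >> 26) & 3u;
}
```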

7.3. Cache decoding of Virtual Addresses

To improve performance, SPARCstation-1 contains a 64K byte virtual address cache, consisting of
4K lines of 16 bytes each. The cache is one-way set associative, with each virtual address mapping to one
and only one possible cache line. There is a 4 byte cache tag associated with each data line.

From the Cache's standpoint, a virtual address has the following format:

     31 30 29                16 15                 4 3            0
    +-----+--------------------+--------------------+--------------+
    |     | cache tag id       | cache line         | byte of line |
    |     | (14 bits)          | (12 bits)          | (4 bits)     |
    +-----+--------------------+--------------------+--------------+

Note: VA31:29 must all be the same (all 0 or all 1). An SE_INVALID error occurs otherwise.

VA15:4 selects one of 4K cache lines. If the cache tag id matches (and, for non-supervisor-mode-only
pages, the context ID), then a cache hit occurs. VA3:2 selects the desired word from the cache line.
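
The cache's view of a virtual address can be sketched with three field extractors. The function names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical helpers modelling the cache's decoding of a virtual address. */

/* Cache line selected by VA15:4 (one of 4K lines). */
unsigned cache_line_index(uint32_t va)
{
    return (va >> 4) & 0xFFFu;
}

/* Cache tag id taken from VA29:16 (14 bits). */
unsigned cache_tag_id(uint32_t va)
{
    return (va >> 16) & 0x3FFFu;
}

/* Word within the 16-byte line, selected by VA3:2. */
unsigned cache_word_in_line(uint32_t va)
{
    return (va >> 2) & 3u;
}
```

Two addresses with equal cache_line_index values but different cache_tag_id values compete for the same line, since the cache is one-way set associative.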

A cache tag has the following format:

     31       25 24  22 21 20 19 18   16 15                    2 1   0
    +-----------+------+--+--+--+-------+-----------------------+-----+
    | 0 ..... 0 | CID  |W |S |V | 0 0 0 | cache tag id (14 bits)| 0 0 |
    +-----------+------+--+--+--+-------+-----------------------+-----+

    CID    Cache Tag Context (copied from Cache Context Register when cache line is filled.) Note that only
           CID2:0 are present.
    W      1=write access allowed (copied from MMU when cache line is filled.)
    S      1=Supervisor mode access only (copied from MMU when cache line is filled.)
    V      1=entry is valid

Programming Notes:

(1) The cache tags must be initialized by software before the cache is enabled, by clearing the valid bit
in the cache tag of each cache line. It is sufficient to do fullword stores of zero into (ASI=2,
A31:28=0x8, A15:4=0x0 to 0xfff).

(2) To flush all references to a context from the cache, a Flush Cache (Context) operation must be
performed by selecting the appropriate context (by performing a byte store into the Context Register,
(ASI=2, A31:28=0x3)) and doing fullword stores of zero into (ASI=0xe, A15:4=0x0 to 0xfff).

(3) To flush all references to a segment from the cache, a Flush Cache (Segment) operation must be
performed by selecting the appropriate context and doing fullword stores of zero into (ASI=0xc,
A29:18=desired segment, A15:4=0x0 to 0xfff). A17:16 are ignored for this operation.

(4) To flush all references to a page from the cache, a Flush Cache (Page) operation must be performed
by selecting the appropriate context and doing fullword stores of zero into (ASI=0xd,
A29:12=desired page, A11:4=0x0 to 0xff).
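
The flush operations amount to a loop of stores whose addresses step through the cache line field. The sketch below only computes those addresses; the stores themselves need a store-alternate instruction with the ASI named above, which plain C cannot express. The names are hypothetical, and keeping VA31:30 unchanged from the original address is an assumption made here.

```c
#include <stdint.h>

/* Hypothetical constants: number of fullword stores per flush operation. */
#define FLUSH_PAGE_STORES    256u   /* A11:4 = 0x0 to 0xff */
#define FLUSH_SEGMENT_STORES 4096u  /* A15:4 = 0x0 to 0xfff */

/* Address of the i-th store (0..255) of a Flush Cache (Page) operation
 * for the page containing va. Assumption: VA31:30 are left as in va. */
uint32_t flush_page_store_addr(uint32_t va, unsigned i)
{
    return (va & 0xFFFFF000u) | ((uint32_t)(i & 0xFFu) << 4);
}

/* Address of the i-th store (0..4095) of a Flush Cache (Segment)
 * operation for the segment containing va. A17:16 are ignored by the
 * hardware, so the line index simply overlays A15:4. */
uint32_t flush_segment_store_addr(uint32_t va, unsigned i)
{
    return (va & 0xFFFC0000u) | ((uint32_t)(i & 0xFFFu) << 4);
}
```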

7.4. Aliasing 

Because the cache is bigger than a page, a physical page that is mapped by two (or more) distinct
virtual addresses could result in data from the same physical address appearing in two (or more) cache
lines:

    MMU view:
     31 30 29                18 17      12 11                      0
    |     | segment (12 bits)  |   page   | byte in page (12 bits) |

    Cache view:
     31 30 29                  16 15                   4 3       0
    |     | cache tag (14 bits)  | cache line (12 bits) |  byte  |

This situation cannot be detected by the hardware and must be avoided by the software. There are
two methods that may be used:

(1) All the virtual addresses for an aliased page must be identical in bits A15:12. That is, the virtual
addresses must be congruent modulo 64K (the cache size). This will result in the same cache line
being used for the different virtual addresses that map to the same physical address. This is the
preferred method. (Note that the hardware doesn't know that the different virtual addresses map to
the same physical address, and alternate use of the different virtual addresses will result in
invalidating and then refilling the cache line from the same physical address. Also, the hardware
automatically invalidates a cache line when a cache miss occurs on a write operation. This ensures
the consistency of the cache with memory when aliasing via this method occurs.)

(2) Each PME that points to the aliased physical page must have the "Don't Cache" bit (PME28) set.
This method must be used if the previous method cannot.
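
The first method reduces to a comparison of bits A15:12 of the two page addresses. A minimal sketch, with a hypothetical function name:

```c
#include <stdint.h>

/* Nonzero if two virtual page addresses that map the same physical
 * page select the same cache lines, i.e. agree in bits A15:12 and
 * are therefore congruent modulo 64K for page-aligned addresses. */
int alias_safe(uint32_t va1, uint32_t va2)
{
    return ((va1 ^ va2) & 0x0000F000u) == 0;
}
```

An operating system could apply such a check when choosing a second virtual address for a shared page, and fall back to setting the "Don't Cache" bit when no congruent address is available.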
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Description 



The L64801 integer Unit (lU) is a high performance 
CMOS implementation of the SPARC (Scalable 
Processor ARChitecture} 32bit RISC microproces- 
sor. SPARC is an open architecture which is being 
implemoited in a variety of forms by various semi- 
conductor manufacturers. This multiple sourcing al- 
lows designers to choose from a wide variety of 
price/performance options and provides a rich se- 
lection of peripherals, memory devices and propri- 
etary ASIC extensions. 

The L64801 features a large register file to optimize
procedure calls, variable assignments and
context switches. Execution speed improves
significantly because this register-to-register
architecture minimizes the number of external memory
accesses. Most of the L64801 instructions execute
in a single cycle due to its 4-stage pipeline
that minimizes interlocks, a bus structure that
allows single-cycle instruction/data accesses and an
optimized branch handler.



The L64801 can sustain 15 VAX MIPS perform- 
ance with peak performance of 25 MIPS, offer- 
ing designers the speed and power of a super 
minicomputer. 



L64801 Chip Photo



Features 



High performance operation 

Commercial

L64801C-20 12 VAX MIPS 

L64801C-25 15 VAX MIPS 

Military 

L64801M-15 9 VAX MIPS 

L64801M-20 12 VAX MIPS 

Open architecture: 

- Multiple vendor sourced 

- Each vendor provides unique features and
extensions

- Variety of binary compatible price/performance
options

Optimized for operation under high-level languages
such as C, FORTRAN, Pascal and Ada and the
UNIX™ operating system

External MMU, memory system and floating-point
unit assure flexible interface for the largest range
of applications and price/performance levels



32-bit virtual address bus 

- Supports up to 4 Gbytes of direct address space

- Allows a variety of memory management and 
caching schemes 

Simple instruction format with fast instruction
cycle with a 4-stage pipeline

Single cycle execution for the majority of
instructions

Large central register file divided into seven
overlapping windows of 24 registers each

All pipeline interlocks implemented directly in
hardware

High performance coprocessor interface for
concurrent execution of floating-point or other
coprocessor instructions

Multitasking support with user/supervisor mode
and privileged instructions

Artificial intelligence support through use of tagged
instructions

Option to use as ASIC core

179-pin ceramic or plastic pin grid array packages
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Rgure 1. L64801 Pinout Diagram 
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Figure 2. L64801 Functional Block Diagram 
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Introduction 



The L64801 is the first processor in the LSI
Logic family of SPARC (Scalable Processor
ARChitecture) microprocessors. SPARC is an
architecture defined by Sun Microsystems which is
based on the principles of RISC (Reduced Instruction
Set Computer) techniques. The key feature of
SPARC is its use of a large central register file
which is divided into several "register windows"
for high performance during subroutine calls and
context switching.

The SPARC family is supported by a full line of
highly optimizing compilers, operating systems,
development boards, development systems and
development tools.

SPARC is an open architecture, built by a number
of semiconductor suppliers, which will provide
rapid enhancement of features for different markets
and a wide range of price/performance
options. LSI Logic has chosen to implement the
L64801 using its own industry standard ASIC
techniques. This allows rapid implementation of the
L64801 design into new process technologies as
well as the availability of the L64801 as a
microprocessor core within a more complex ASIC.



Architecture 
Overview 



The L64801 SPARC chip set consists of a central
integer unit (IU) which provides all the core
functions of the SPARC instruction set as defined by
the SPARC architecture manual. To increase
performance of floating-point operations, there is an
optional floating-point unit (FPU) and a separate
interface chip called the floating-point controller
(FPC).

The IU is the primary computing element. It
performs all operations except floating-point
operations (FPops) which are either performed in
hardware through the FPC/FPU combination, or in
software. The FPC/FPU provides execution of
FPops concurrent with integer operations.

The IU features a large central register file
partitioned as sets of working registers (r registers)
which provide storage for processes. In addition,
there are independent control registers which keep
track of and control the state of the IU.

There are a total of 120 32-bit registers which are 
divided into seven separate register windows. Each 
window contains 24 working registers plus eight 
global registers. 

- Address Bus 

- Data Bus 




Note: All lines shown are 32 bits wide. 
Figure 3. L64801 Core Chip Set

[Block diagram: the L64801 IU, the L64802 FPC with a TI 8847 FPP
(together forming the floating-point unit), and the L64803 MMU.
Labeled paths: System Data Bus, Cache Data and Control, Virtual
Address, Physical Address, System Address Bus.]

Figure 4. SPARC System-Level Diagram



L64801 

High Performance 
Open Architecture 
RISC Microprocessor 

Preliminary 



LSI 



LOGIC 



Register Windows 



Perhaps the most distinguishing characteristic of
the SPARC architecture is the overlapping register
windows. In order to optimize operations such as
subroutine calls and context switching, the register
file is divided into sets of register windows. There
are a total of eight global registers which are
available at all times and seven register windows of 24
registers each that are available at any point in
time. These register windows overlap each other
by eight registers on either side for parameter
passing between processes. The register
configuration at any point in time is as follows:

R0 thru R7      Global Registers
R8 thru R15     Output Parameters to Next Process
R16 thru R23    Local Registers to Current Process
R24 thru R31    Input Parameters from Previous Process

In the L64801 IU, there are a total of 120 registers
divided into seven register windows. The current
window pointer (CWP) field within the processor
state register (PSR) keeps track of which window
is currently active. The pointer is decremented
when the processor executes a call to the next
window and is incremented when a return is
executed. The windows are joined in a circular stack
where the output parameters of window 6 are
coincident with the input parameters of window 0.
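
The window-pointer arithmetic described above can be sketched in a few lines of Python. This is only an illustrative model: the function names are hypothetical and are not taken from any LSI Logic tool, and the count of 16 unshared registers per window is inferred from the 120-register total given in the text.

```python
# Hypothetical model of the L64801 window pointer; names are illustrative only.
NWINDOWS = 7             # seven register windows, per the text
GLOBALS = 8              # R0 thru R7
UNIQUE_PER_WINDOW = 16   # assumed: 8 locals + 8 registers not shared with a neighbor


def total_registers(nwindows: int = NWINDOWS) -> int:
    """Physical registers: the globals plus the unshared registers of each window."""
    return GLOBALS + nwindows * UNIQUE_PER_WINDOW


def cwp_after_call(cwp: int, nwindows: int = NWINDOWS) -> int:
    """A call to the next window decrements CWP, wrapping around the circular stack."""
    return (cwp - 1) % nwindows


def cwp_after_return(cwp: int, nwindows: int = NWINDOWS) -> int:
    """A return increments CWP, wrapping around the circular stack."""
    return (cwp + 1) % nwindows


if __name__ == "__main__":
    print(total_registers())
    print(cwp_after_call(0))
    print(cwp_after_return(6))
```

With seven windows the model reproduces the 120-register total, and a call made from window 0 wraps around to window 6.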



The register file is triple ported. This allows the
fetching of two register operands and the writing
of a destination register to occur simultaneously in
a single clock cycle.
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Rgure 5. Example of Register Windows 
(3 Windows Shown) 



In this figure, NWINDOWS = 8. It does not show
the 8 globals. If the procedure corresponding
to the window labeled w0 does a procedure call
(executes a SAVE instruction), a window_overflow
trap will occur. The overflow trap handler uses
the locals of w7:

CWP = 0                  active window = 0
CWP + 1 = 1              previous window = 1
CWP - 1 = 7              next window = 7
WIM = 10000000 (binary)  trap window = 7

Note: In the L64801 implementation NWINDOWS
actually equals 7, not 8.



Figure 6. Register Windows implemented as a Circular Stack 
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Four-Stage Pipeline 



The L64801 Integer Unit uses a 4-stage instruction
pipeline comprising Fetch, Decode, Execute and
Write stages. A basic single-cycle instruction
enters the pipeline and completes four cycles later.
Once the pipeline is filled, four separate
instructions may be executing each of the following
phases in an overlapping fashion.

Fetch (F)

An instruction is fetched from the bus interface
and placed in the instruction register.

Decode (D)

The instruction is decoded and operands are read
from the register file. Memory addresses are
evaluated for loads, stores and control transfers.



Execute (E) 

The operation specified by the instruction is 
executed and the results are saved in the 
processor's temporary registers. 

Write (W)

The result of the executed operation is stored in
the destination register (provided that no trap
exceptions have occurred during execution).

The L64801 Integer Unit detects data dependencies
and provides hardware interlocks in pipeline
operation to properly resolve such dependencies
without complex software intervention. Pipeline
interlock occurs if an instruction fetch takes more
than one clock cycle. Multicycle instructions hold
the pipeline long enough to complete their
execution.



Bus interface 



The L64801 accesses instructions and data and 
performs system control functions through its high 
bandwidth bus interface. The bus interface has 
separate address and data lines and sets of control 
lines with protocols which support: 

- Single and multiple-clock period reads and writes

- Full and partial-word (byte and halfword) writes 



- Multimaster bus protocols 

- Fifteen levels of external interrupt requests 

- Memory exception traps 

The IU acts as a bus slave: it has no bus grant or
bus request circuits. It uses signals such as LOCK
to lock the bus and BHOLD, MHOLD, or SHOLD to
be locked off the bus.



Memory/Cache 
interface 



The L64801 Integer Unit can be interfaced to a
variety of memory subsystems: cached, non-cached,
virtual, physical, static, dynamic, etc. The
processor normally expects to receive a new
instruction every cycle. If the memory is not fast
enough to provide instructions at this rate, then
wait states are inserted using the memory hold
(MHOLD) inputs. In systems with non-cached memory,
every memory reference appears to the IU as a
cache miss. In a fast memory (cached) system, the
bus interface protocol maximizes the advantage of
such memory by receiving or sending data during
the same clock period in which the address is
transmitted. Thus single-cycle reads and writes
can be performed with sufficiently fast memory or
peripheral devices.



Cached memory systems should use lower order
address bits to address cache RAMs and higher
order address bits to compare cache tags. There is
no strict definition of cache sizes or tag sizes. HAL
is used to synchronize an off-chip register known
as the cache address register (CAR) with on-chip
address registers. CAR operates as part of the IU
pipeline and HAL inhibits the latch. For every cache
access, the cache miss logic must send a hit or
miss indication to the processor in the next cycle.
If the cache hits, no wait state is inserted and the
memory access completes in one cycle.
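
As a rough illustration of the low-bits/high-bits split described above, the Python sketch below divides an address into tag, line index and byte offset. The geometry (direct-mapped, 64 Kbytes, 16-byte lines) is an assumption chosen for the example, since the text states that cache and tag sizes are not strictly defined; the function names are hypothetical.

```python
# Assumed geometry for illustration only: direct-mapped, 64 KB, 16-byte lines.
LINE_BYTES = 16
CACHE_BYTES = 64 * 1024

OFFSET_BITS = LINE_BYTES.bit_length() - 1                   # 4 bits of byte offset
INDEX_BITS = (CACHE_BYTES // LINE_BYTES).bit_length() - 1   # 12 bits of line index


def split_address(addr: int):
    """Return (tag, line_index, byte_offset) for a 32-bit address."""
    addr &= 0xFFFFFFFF
    offset = addr & (LINE_BYTES - 1)                        # lowest bits: byte in line
    index = (addr >> OFFSET_BITS) & ((1 << INDEX_BITS) - 1)  # low bits: address cache RAM
    tag = addr >> (OFFSET_BITS + INDEX_BITS)                # high bits: compared with tag
    return tag, index, offset


def is_hit(addr: int, tags: dict) -> bool:
    """tags maps line_index -> stored tag; a hit requires a matching stored tag."""
    tag, index, _ = split_address(addr)
    return tags.get(index) == tag
```

A different cache size or line length only changes the two constants at the top; the split itself stays the same.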



Coprocessor Interface 



The integer unit is the basic processing engine
which executes all of the instruction set except for
floating-point operations. Software for
non-floating-point intensive applications is supported.
Where high performance floating-point is desirable,
a floating-point controller (FPC) and IU operate
concurrently. The FPC recognizes floating-point
instructions and places them in a queue while the
IU continues to execute non-floating-point
instructions. If the FPC encounters an instruction which
will not fit in its queue, the FPC holds the IU until
the instruction can be stored. The FPC contains its
own set of registers on which it operates. The
contents of these registers are transferred to and
from external memory under control of the IU via
floating-point load/store instructions. Processor
interlock hardware hides floating-point concurrency
from the compiler or assembly language programmer.
A program containing floating-point computation
generates the same results as if instructions
were executed sequentially.
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Special Purpose 
Registers 



The integer unit contains six 32-bit special purpose
control/status registers which are used for general
program control, setting modes of operation and
showing processor status.

Processor status register (PSR) contains fields
describing the state of the IU.



impl(31:28)       Implementation Number of the Processor
ver(27:24)        Version Number of the Processor
icc(23:20)        Integer Condition Codes (n, z, v, c)
reserved(19:14)   Reserved for Future Options
EC(13)            Enable Coprocessor
EF(12)            Enable Floating-Point Unit
PIL(11:8)         Processor Interrupt Level
S(7)              Supervisor Mode
PS(6)             Prior S-Bit (held at time of trap)
ET(5)             Enable Traps
CWP(4:0)          Current Window Pointer (marks current reg window)
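
The field layout listed above can be unpacked with shifts and masks. The following Python sketch is one possible decoder, not vendor code: the function name and the dictionary keys are hypothetical, and it assumes the condition codes occupy bits 23 to 20 in the order n, z, v, c.

```python
def decode_psr(psr: int) -> dict:
    """Split a 32-bit PSR value into the fields listed in the table above."""
    psr &= 0xFFFFFFFF
    icc = (psr >> 20) & 0xF          # icc(23:20), assumed order n, z, v, c
    return {
        "impl": (psr >> 28) & 0xF,   # impl(31:28)
        "ver": (psr >> 24) & 0xF,    # ver(27:24)
        "n": (icc >> 3) & 1,
        "z": (icc >> 2) & 1,
        "v": (icc >> 1) & 1,
        "c": icc & 1,
        "EC": (psr >> 13) & 1,       # enable coprocessor
        "EF": (psr >> 12) & 1,       # enable floating-point unit
        "PIL": (psr >> 8) & 0xF,     # processor interrupt level
        "S": (psr >> 7) & 1,         # supervisor mode
        "PS": (psr >> 6) & 1,        # prior S bit
        "ET": (psr >> 5) & 1,        # enable traps
        "CWP": psr & 0x1F,           # current window pointer
    }
```

Mode fields (upper-case keys) are the ones software would write; status fields (lower-case keys) are read-only results in this model.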



Program Counter and Next Program Counter
(PC and NPC)
PC contains the address of the instruction
currently being executed by the IU. NPC holds
the address of the next instruction to be
executed (except when a trap occurs).



Window Invalid Mask Register (WIM)
WIM is used to determine whether a window
overflow or underflow trap should be generated.
Each bit of the WIM corresponds to a single
register window. For the L64801 with seven
register windows, only WIM(6:0) are used.
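
A minimal Python sketch of how the WIM might be consulted is given below. The text only says that the WIM determines whether a trap is generated; the rule used here (a SAVE tests the bit for window CWP - 1, a RESTORE tests the bit for window CWP + 1) follows the general SPARC architecture definition and is stated as an assumption. Function names are hypothetical.

```python
def wim_bit_set(wim: int, window: int) -> bool:
    """True if the WIM marks the given window as invalid."""
    return bool((wim >> window) & 1)


def save_would_trap(cwp: int, wim: int, nwindows: int = 7) -> bool:
    """Window overflow: a SAVE would move into a window marked invalid."""
    return wim_bit_set(wim, (cwp - 1) % nwindows)


def restore_would_trap(cwp: int, wim: int, nwindows: int = 7) -> bool:
    """Window underflow: a RESTORE would move into a window marked invalid."""
    return wim_bit_set(wim, (cwp + 1) % nwindows)
```

Called with nwindows=8, CWP = 0 and WIM = 10000000 (binary), save_would_trap reports an overflow, which matches the situation drawn in Figure 6.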

Trap Base Register (TBR)
TBR contains three fields that generate the
address of the trap handler when a trap occurs.

TBA(31:12)   Trap Base Address (most significant 20 bits of trap table address)
tt(11:4)     Trap Type, provides offset into the trap table
zero(3:0)    Zero
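
Because the three TBR fields are simply concatenated, the handler address can be formed with one mask and one shift. The Python sketch below illustrates this under the field positions given above; the function name is hypothetical.

```python
def trap_handler_address(tbr: int, tt: int) -> int:
    """Concatenate TBA(31:12), tt(11:4) and four zero bits into a handler address."""
    tba = tbr & 0xFFFFF000          # keep only the 20-bit trap base address
    return tba | ((tt & 0xFF) << 4)  # tt selects a 16-byte slot in the trap table
```

Since the low four bits are always zero, consecutive trap types are 16 bytes apart in the trap table.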

Y Register
The Y register is used by the multiply step
instruction to hold 32-bit results and create
64-bit products.

Control/status registers contain two types of
fields, mode and status. Mode fields are set by the
programmer and are designated through the use of
an upper-case naming convention. Status fields are
set by the processor and use a lower-case naming
convention.



Exception Handling 



The L64801 generates traps in response to both
internal (synchronous) and external (asynchronous)
events. These traps switch control from the
instruction stream to an address in a trap table
(except a reset trap which transfers control to
virtual address 0). Synchronous traps occur
immediately, not waiting for the current instruction
to be completed. Asynchronous traps wait for the
currently executing instruction to complete before
they occur.

Each type of trap is assigned a priority; when
multiple traps occur, the highest priority trap is
taken and lower priority traps are ignored. To be
taken, the request for the lower priority trap must
either persist or be repeated.

Traps are vectored. The trap base address (TBA)
register points to the trap table. Interrupts are
given to the processor using four interrupt input
signals. Any value other than zero on these inputs
is interpreted by the processor as an external
interrupt request. This value is compared with the
current processor interrupt level in the processor
status register (PSR). The interrupt is taken if the
external interrupt request level is greater than the
processor interrupt level. The highest level
interrupt (level 15) is nonmaskable. When a trap is
detected, the processor takes the following
actions:

1. The program counters corresponding to the
trapped instruction and the instruction following
the trapped instruction are saved in the register
file.

2. The execution of the trapped instruction is
aborted and all fetched but unfinished instructions
are flushed out of the pipeline.

3. All traps are disabled. The processor mode is set
to supervisor and the CWP is set to point to the
next window.

4. The trap address, based on the TBA and tt fields
of the TBR, is computed and loaded into
the program counter.

5. Execution is restarted from the new trap
address.
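
The interrupt comparison and the trap-entry steps listed above can be modelled in Python as follows. This is a simplified sketch under stated assumptions, not a description of the silicon: the class and function names are hypothetical, the saved program counters are held in two stand-in attributes rather than in specific registers, "next window" is taken to mean CWP - 1 as for a call, copying S into PS is inferred from the PSR field description, and setting NPC to PC + 4 on entry is an assumption.

```python
from dataclasses import dataclass

NWINDOWS = 7  # seven windows on the L64801


@dataclass
class IUState:
    pc: int
    npc: int
    tba: int                 # trap base address; low 12 bits are ignored
    cwp: int = 0
    s: int = 0               # supervisor bit
    ps: int = 0              # prior S bit
    et: int = 1              # enable traps
    pil: int = 0             # processor interrupt level
    saved_pc: int = 0        # stand-ins for the register-file slots
    saved_npc: int = 0
    error_mode: bool = False


def interrupt_taken(state: IUState, level: int) -> bool:
    """External interrupts are ignored with traps disabled; level 15 is nonmaskable."""
    if not state.et or level == 0:
        return False
    return level == 15 or level > state.pil


def take_trap(state: IUState, tt: int) -> IUState:
    """Apply the trap-entry actions for a synchronous trap of type tt."""
    if not state.et:
        state.error_mode = True              # trap while traps are disabled
        return state
    state.saved_pc, state.saved_npc = state.pc, state.npc   # step 1
    state.et = 0                                            # step 3: traps disabled
    state.ps, state.s = state.s, 1                          # step 3: supervisor mode
    state.cwp = (state.cwp - 1) % NWINDOWS                  # step 3: next window
    state.pc = (state.tba & 0xFFFFF000) | ((tt & 0xFF) << 4)  # step 4
    state.npc = (state.pc + 4) & 0xFFFFFFFF                 # assumed sequential NPC
    return state
```

Step 2 (flushing the pipeline) has no counterpart here because the model keeps no pipeline state.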

All external interrupts are ignored when traps are
disabled. If a synchronous trap is detected while
traps are disabled, the IU enters into an error mode
and remains in that mode until the processor is
reset externally. At reset, the processor enters into
an initial state and starts execution from address
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Instruction Categories 



The L64801 instructions fall into five basic
categories:

Load and Store instructions. (The only way to
access memory). These instructions use two registers,
or a register and a signed immediate value to
generate the memory address. Integer load and
store instructions support 8-, 16-, 32- and 64-bit
accesses while floating-point instructions support
32- and 64-bit accesses.

Load/Store Signed Byte

Load/Store Signed Halfword 

Load/Store Unsigned Byte 

Load/Store Unsigned Halfword 

Load/Store Word 

Load/Store Double Word 

Load/Store Floating-Point Registers 

Load/Store Double Floating-Point Registers 

Load/Store Floating-Point State Register 

Store Double Floating-Point Queue 

Arithmetic/Logical/Shift instructions. These
instructions compute a result that is a function of
two source operands and then write the result
back into a destination register. They perform
arithmetic, tagged arithmetic, logical or shift
operations. Tagged instructions are useful for
implementing artificial intelligence languages such
as LISP because tags provide interpreters with the
type of arithmetic operands.

Add (w/wo modifying condition codes)
Add with Carry (w/wo modifying condition codes)
Tagged Add (w/wo trap on overflow)
Subtract (w/wo modifying condition codes)
Subtract with Carry (w/wo modifying condition codes)
Tagged Subtract (w/wo trap on overflow)
Multiply Step (modify condition codes)
AND (w/wo modifying condition codes)
NAND (w/wo modifying condition codes)
OR (w/wo modifying condition codes)
NOR (w/wo modifying condition codes)
Exclusive-OR (w/wo modifying condition codes)
Exclusive-NOR (w/wo modifying condition codes)
Shift Left Logical
Shift Right Logical
Shift Right Arithmetic
Set High 22 Bits of Register

Coprocessor Operations. These include floating-point
calculations, operations on floating-point
registers and instructions involving the optional
coprocessor. Floating-point operations execute
concurrently with IU instructions and with other
floating-point operations when necessary. This
architectural concurrency hides floating-point
operations from the applications programmer.

Convert Integer to Single/Double/Extended Precision
Convert Single/Double/Extended Precision to Integer (w/wo rounding)
Convert Single Precision to Double/Extended Precision
Convert Double Precision to Single/Extended Precision
Move/Negate/Absolute Value
Square Root Single/Double/Extended
Add Single/Double/Extended
Subtract Single/Double/Extended
Multiply Single/Double/Extended
Divide Single/Double/Extended
Compare Single/Double/Extended (w/wo exception if unordered)

Control-Transfer instructions. These include
jumps, calls, traps and branches. Control transfers
are usually delayed until after execution of the next
instruction, so that the pipeline is not emptied
every time a control transfer occurs. Thus,
compilers can be optimized for delayed branching.
Branch and call instructions use program counter
relative displacements. A jump and link instruction
uses a register indirect displacement, computing its
target address as either the sum of two registers,
or the sum of a register and a 13-bit signed
immediate value. The branch instruction provides a
displacement of eight megabytes and the call
instruction's 30-bit displacement allows transfer to
any address.

Increment Current Window Pointer 

Decrement Current Window Pointer 

Branch on Integer Condition Codes 

Trap on Integer Condition Codes 

Branch on Floating-Point Condition Codes 

Call 

Jump and Link 

Return from Trap 
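
The displacement ranges quoted for branches and calls follow from treating the displacement fields as signed word counts. The Python sketch below works this through; it assumes that displacements are sign-extended, multiplied by four and added to the address of the control-transfer instruction, and the function names are hypothetical.

```python
def sign_extend(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of value as a two's-complement number."""
    value &= (1 << bits) - 1
    sign = 1 << (bits - 1)
    return (value ^ sign) - sign


def branch_target(pc: int, disp22: int) -> int:
    """PC-relative branch: 22-bit signed word displacement."""
    return (pc + 4 * sign_extend(disp22, 22)) & 0xFFFFFFFF


def call_target(pc: int, disp30: int) -> int:
    """PC-relative call: 30-bit word displacement spans the whole address space."""
    return (pc + 4 * sign_extend(disp30, 30)) & 0xFFFFFFFF


def jmpl_target(rs1_value: int, second_operand: int) -> int:
    """Jump and link: sum of two registers, or a register plus a signed immediate."""
    return (rs1_value + second_operand) & 0xFFFFFFFF
```

The most negative 22-bit displacement corresponds to exactly 8 Mbytes backward, which is where the eight-megabyte figure comes from.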

Read/Write Control Register Instructions.

These include instructions to read and write the
contents of various control registers. Generally the
source or destination is implied by the instruction.

Read/Write Multiply Step Register

Read/Write Processor State Register

Read/Write Window Invalid Mask Register

Read/Write Trap Base Register 

Flush Instruction Cache 
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Instruction Execution Times. All instructions 
execute in a single cycle except the following 
instructions: 
Instruction Type                 Cycles

Load (word/halfword/byte)        2
Load (double)                    3
Store (word/halfword/byte)       3
Store (double)                   4
Atomic Load and Store            4
Floating-Point Ops               2 + Cf
Jump and Rett                    2
Branch (taken)                   1
Branch (untaken)                 2
All Other Instructions           1
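
As a hedged illustration of how the cycle counts above might be used, the Python sketch below adds up the cycles for a straight-line sequence of instructions. The category names are hypothetical labels chosen for this example, the term Cf is left as a caller-supplied parameter because the table does not define it, and pipeline interlocks and memory wait states are ignored.

```python
# Cycle counts copied from the table above; keys are illustrative labels.
CYCLES = {
    "load": 2,
    "load_double": 3,
    "store": 3,
    "store_double": 4,
    "atomic": 4,
    "jump": 2,
    "rett": 2,
    "branch_taken": 1,
    "branch_untaken": 2,
}


def instruction_cycles(kind: str, cf: int = 0) -> int:
    """Cycles for one instruction; anything not listed counts as a single cycle."""
    if kind == "fpop":
        return 2 + cf            # "2 + Cf"; Cf must be supplied by the caller
    return CYCLES.get(kind, 1)


def sequence_cycles(kinds, cf: int = 0) -> int:
    """Total cycles for a sequence, ignoring interlocks and wait states."""
    return sum(instruction_cycles(k, cf) for k in kinds)
```

For example, a word load, an add, a word store and a taken branch come to 2 + 1 + 3 + 1 = 7 cycles in this model.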



Instruction Set
Summary

Opcode               Name

LDSB (LDSBA†)        Load Signed Byte (from Alternate Space)
LDSH (LDSHA†)        Load Signed Halfword (from Alternate Space)
LDUB (LDUBA†)        Load Unsigned Byte (from Alternate Space)
LDUH (LDUHA†)        Load Unsigned Halfword (from Alternate Space)
LD (LDA†)            Load Word (from Alternate Space)
LDD (LDDA†)          Load Doubleword (from Alternate Space)
LDF                  Load Floating-Point
LDDF                 Load Double Floating-Point
LDFSR                Load Floating-Point State Register
LDC                  Load Coprocessor
LDDC*                Load Double Coprocessor
LDCSR*               Load Coprocessor State Register
STB (STBA†)          Store Byte (into Alternate Space)
STH (STHA†)          Store Halfword (into Alternate Space)
ST (STA†)            Store Word (into Alternate Space)
STD (STDA†)          Store Doubleword (into Alternate Space)
STF                  Store Floating-Point
STDF                 Store Double Floating-Point
STFSR                Store Floating-Point State Register
STDFQ†               Store Double Floating-Point Queue
STC                  Store Coprocessor
STDC*                Store Double Coprocessor
STCSR*               Store Coprocessor State Register
STDCQ†               Store Double Coprocessor Queue
LDSTUB (LDSTUBA†)    Atomic Load-Store Unsigned Byte (in Alternate Space)
SWAP (SWAPA†)        Swap r Register with Memory (in Alternate Space)
ADD (ADDcc)          Add (and Modify icc)
ADDX (ADDXcc)        Add with Carry (and Modify icc)
TADDcc (TADDccTV)    Tagged Add and Modify icc (and Trap on Overflow)
SUB (SUBcc)          Subtract (and Modify icc)
SUBX (SUBXcc)        Subtract with Carry (and Modify icc)
TSUBcc (TSUBccTV)    Tagged Subtract and Modify icc (and Trap on Overflow)
MULScc               Multiply Step and Modify icc
AND (ANDcc)          And (and Modify icc)
ANDN (ANDNcc)        And Not (and Modify icc)
OR (ORcc)            Inclusive-Or (and Modify icc)
ORN (ORNcc)          Inclusive-Or Not (and Modify icc)
XOR (XORcc)          Exclusive-Or (and Modify icc)
XNOR (XNORcc)        Exclusive-Nor (and Modify icc)
SLL                  Shift Left Logical
SRL                  Shift Right Logical
SRA                  Shift Right Arithmetic
SETHI                Set High 22 Bits of r Register
SAVE                 Save Caller's Window
RESTORE              Restore Caller's Window
Bicc                 Branch on Integer Condition Codes
FBfcc                Branch on Floating-Point Condition Codes
CBccc                Branch on Coprocessor Condition Codes
CALL                 Call
JMPL                 Jump and Link
RETT†                Return from Trap
Ticc                 Trap on Integer Condition Codes
RDY                  Read Y Register
RDPSR†               Read Processor State Register
RDWIM†               Read Window Invalid Mask Register
RDTBR†               Read Trap Base Register
WRY                  Write Y Register
WRPSR†               Write Processor State Register
WRWIM†               Write Window Invalid Mask Register
WRTBR†               Write Trap Base Register
UNIMP                Unimplemented Instruction
IFLUSH               Instruction Cache Flush
FPop                 Floating-Point Operate: FiTO(s, d, x), F(s, d, x)TOi,
                     FsTOd, FsTOx, FdTOs, FdTOx, FxTOs, FxTOd, FMOVs, FNEGs,
                     FABSs, FSQRT(s, d, x), FADD(s, d, x), FSUB(s, d, x),
                     FMUL(s, d, x), FDIV(s, d, x), FCMP(s, d, x),
                     FCMPE(s, d, x)
CPop                 Coprocessor Operate

* Unimplemented Instruction
† Privileged Instruction
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InstructiOR Fornists 
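(Summary)

Format 1: CALL

op (31:30) | disp30 (29:0)

Format 2: SETHI and Branches (Bicc, FBfcc, CBccc)

op (31:30) | rd (29:25) | op2 (24:22) | imm22 (21:0)
op (31:30) | a (29) | cond (28:25) | op2 (24:22) | disp22 (21:0)

Format 3: Remaining Instructions

op (31:30) | rd (29:25) | op3 (24:19) | rs1 (18:14) | i (13) | asi (12:5) | rs2 (4:0)
op (31:30) | rd (29:25) | op3 (24:19) | rs1 (18:14) | i (13) | simm13 (12:0)
op (31:30) | rd (29:25) | op3 (24:19) | rs1 (18:14) | opf (13:5) | rs2 (4:0)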



Instruction Format 
Field Definitions 



op

This field places the instruction into one of the
three major formats:

Use of op Field

Format   op Value   Instruction
1        1          Call
2        0          Bicc, FBfcc, CBccc, SETHI
3        2 or 3     Other

op2

This field comprises bits 24 through 22 of format 2
instructions. It selects the instruction as follows:

Use of op2 Field

op2 Value   Instruction
0           UNIMP
2           Bicc
4           SETHI
6           FBfcc
7           CBccc

rd

For store instructions, this field selects an
r register (or an r register pair), or an f register (or
an f register pair) to be the source. For all other
instructions, this field selects an r register (or an r
register pair), or an f register (or an f register pair)
to be the destination.

Note: Reading r[0] produces the result 0, and writing it
causes the result to be discarded.

a

The "a" bit means "annul" in format 2 instructions.
This bit changes the behavior of the instruction
encountered immediately after a control
transfer.

cond

This field selects the condition code for format 2
instructions.

imm22

This field is a 22-bit constant value used by the
SETHI instruction.

disp22 and disp30

These fields are 30-bit and 22-bit sign-extended
word displacements, for PC-relative calls and
branches, respectively.

op3

The op3 field selects one of the format 3 opcodes.

i

The i bit selects the type of the second ALU
operand for non-FPop instructions. If i = 0, the
second operand is r[rs2]. If i = 1, the second
operand is sign-extended simm13.

asi

This 8-bit field is the address space identifier
generated by load/store alternate instructions.
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Instructiofi Format 
Field Definitions 

(Continued) 



rs1

This 5-bit field selects the first source operand
from either the r registers or the f registers.

rs2

This 5-bit field selects the second source operand
from either the r registers or the f registers.

simm13

This field is a sign-extended 13-bit immediate value
used as the second ALU operand when i = 1.

opf

This 9-bit field identifies a floating-point operate
(FPop) instruction or a coprocessor operate (CPop)
instruction.
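
The field definitions above are enough to split a 32-bit instruction word into its parts. The Python sketch below is an illustrative decoder, not vendor code: the function names and dictionary keys are hypothetical, the bit positions of rd, cond, op3 and rs1 are assumed from the general SPARC instruction layout, and FPop/CPop words (which use the opf field in place of i and asi) are not treated separately.

```python
def sign_extend(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of value as a two's-complement number."""
    value &= (1 << bits) - 1
    sign = 1 << (bits - 1)
    return (value ^ sign) - sign


def decode(word: int) -> dict:
    """Split an instruction word into the fields of format 1, 2 or 3."""
    word &= 0xFFFFFFFF
    op = word >> 30
    if op == 1:                                   # format 1: CALL
        return {"format": 1, "op": op, "disp30": sign_extend(word, 30)}
    if op == 0:                                   # format 2
        op2 = (word >> 22) & 0x7
        if op2 in (2, 6, 7):                      # Bicc, FBfcc, CBccc
            return {"format": 2, "op": op, "op2": op2,
                    "a": (word >> 29) & 1,
                    "cond": (word >> 25) & 0xF,
                    "disp22": sign_extend(word, 22)}
        return {"format": 2, "op": op, "op2": op2,  # SETHI and UNIMP layout
                "rd": (word >> 25) & 0x1F,
                "imm22": word & 0x3FFFFF}
    fields = {"format": 3, "op": op,              # format 3: op is 2 or 3
              "rd": (word >> 25) & 0x1F,
              "op3": (word >> 19) & 0x3F,
              "rs1": (word >> 14) & 0x1F,
              "i": (word >> 13) & 1}
    if fields["i"]:
        fields["simm13"] = sign_extend(word, 13)  # second operand is an immediate
    else:
        fields["asi"] = (word >> 5) & 0xFF
        fields["rs2"] = word & 0x1F               # second operand is r[rs2]
    return fields
```

For instance, the word 0x86004002 decodes as a format 3 instruction with rd = 3, rs1 = 1, i = 0 and rs2 = 2.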



Pin Descriptions 



The signals on the L64801 are divided into three
main categories: memory subsystem interface
signals, floating-point unit interface signals and
miscellaneous I/O signals. Signals which are
asserted LOW are indicated by an overscore.

Memory Subsystem interface Signals 

A[31:0]

Address Bus

The address bus is output directly from an on-chip
memory address register and is valid every cycle.
During an instruction fetch cycle, the bus carries
the instruction address, and during a load or store
data cycle, it carries a data address. The address
bus remains valid during all data cycles of loads,
stores, load doubles and atomic load/stores. In
systems with cache, the low bits of the address
are used to read the cache RAMs and cache TAGs,
and the high bits of the address are used to
compare the TAGs.

ASI[7:0]

Address Space Identifier

These bits identify the address space during
instruction or data accesses. The value of these
signals at any given cycle represents the address
space containing the memory address specified by
A[31:0] during that cycle. ASI[7:0] remains valid on
the bus during all data cycles of loads, stores, load
doubles, and atomic load/stores. ASI[7:0] pins are
3-stated if AOE is deasserted. The following ASI
values are currently assigned:



ASI        Address Space

00001000   User Instruction
00001010   User Data
00001001   Supervisor Instruction
00001011   Supervisor Data

During the data cycles of alternate load and store
instructions, ASI[7:0] carries the space identifier
specified by the instruction opcode.
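
The four assigned ASI values differ only in their two low bits, which suggests a compact encoding. The Python sketch below captures the table as data and derives the value from the access type; the function name is hypothetical, and reading bit 0 as "supervisor" and bit 1 as "data" is an observation drawn from the table rather than a statement made in the text.

```python
# ASI values copied from the table above.
ASI_NAMES = {
    0b00001000: "User Instruction",
    0b00001010: "User Data",
    0b00001001: "Supervisor Instruction",
    0b00001011: "Supervisor Data",
}


def asi_for_access(supervisor: bool, data: bool) -> int:
    """ASI driven for a normal (non-alternate) instruction or data access."""
    return 0b00001000 | (0b10 if data else 0) | (0b01 if supervisor else 0)
```

External logic such as an MMU could use the same two bits to check access rights, but that use is outside what the table itself specifies.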

D[31:0]
Data Bus

The bidirectional data bus to and from the IU. It is
driven by the IU only during the execution of
integer store instructions or during the store cycle
of atomic load/store instructions. It is driven by the
FPC only during the execution of floating-point
store instructions. The alignment for load and store
instructions is done inside the IU, which always
expects instructions to be fetched from 32-bit wide
memory.



MEXC (Asserted LOW) 
Memory Exception Input 

The memory or cache controller asserts this signal
to signal an instruction-access-exception, or a
data-access-exception. It is latched in the IU and
used during the following cycle. If MEXC is
asserted during an instruction fetch cycle, the IU
generates an instruction access exception. If
MEXC is asserted during a data fetch cycle, the IU
generates a data access exception trap.



MHOLDA, MHOLDB, MHOLDC, SHOLD
(Asserted LOW)
Hold From Memory

These signals freeze the processor pipeline as long
as any of them are asserted. They are used to
freeze the clock to the IU and FPU during a cache
miss (for systems with cache), or when accessing a
slow memory. The IU hardware uses the logical OR
of MHOLDA, MHOLDB, MHOLDC, and SHOLD to
generate a final MHOLD for freezing the processor
pipeline.



BHOLD (Asserted LOW)
Hold From I/O System

The I/O controller asserts this signal when an
external bus master needs the data bus. This signal
freezes the processor pipeline. External logic should
guarantee that the data on the inputs to the IU is
the same after BHOLD is disasserted as it was
before BHOLD was asserted.

DOE (Asserted LOW)
Data Bus Output Enable

This signal turns on the output drivers to the
D[31:0] bus. It is connected directly to the output
drivers and therefore must normally be asserted. It
may be disasserted only when the bus is to be used by
another bus master. This should only occur when
BHOLD, MHOLDA, MHOLDB, MHOLDC, or
SHOLD is asserted.
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AOE (Asserted LOW) 
Address Bus Output Enable 

This signal enables the A[31:0] outputs. It is
normally asserted except when the bus is to be
used by another bus master.

ASIOE (Asserted LOW)
Address Space Identifier Output Enable

ASIOE enables the ASI outputs. It is normally
asserted except when the bus is to be used by
another bus master.



MDS (Asserted LOW) 

Memory Data Input Strobe During Hold 

This signal enables the clock input to the on-chip
instruction register (during an instruction fetch), or
to the load result register (during a data fetch). It is
used in systems with cache or with slow memory,
to signal the processor when data is ready on the
bus. It should only be asserted when the processor
pipeline is frozen (MHOLDA, MHOLDB, MHOLDC,
or SHOLD is asserted).

TC (Asserted LOW) 
Trap Condition 

The state of this signal controls the behavior of the
IFLUSH instruction. If TC is HIGH, IFLUSH executes
like NOP with no side effects. If TC is LOW,
IFLUSH causes an unimplemented instruction trap.

SIZE[1:0]
Data Bus Transfer Size

SIZE represents the data size of the memory
address currently on A[31:0]. They remain valid on
the bus during all data cycles of loads, stores, load
doubles, store doubles, and atomic load/stores.
They are encoded as follows:

SIZE[1:0]   Data Size

00          Byte
01          Halfword
10          Word
11          Word for LDDF, STDF, and STDFQ



LOST 
Load/Store Cycle 

TNs signal is asserted during a!! data cycles of 
atomic load/store instructions. LOST is 3-stated if 
AQE is disasserted 

RD 

Read Cycle 

This signal is set LOW during data cycles of store 
instructions (including the store cycles of atomic 
load/store instructions). In conjunction with 
SIZE[1:0], ASI[7:0], and LDST, it can be used to 
determine the type of a bus transaction, and to 
check read/write access rights. RD may also be 
used to turn off the output drivers of data RAMs 
during a store operation. For atomic load/store 
instructions, RD is HIGH during the first data 
(read) cycle, and LOW during the second and third 
data (write) cycles. RD is 3-stated if AOE is 
disasserted. 
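
As a rough illustration, external logic could combine the read signal, the atomic load/store signal, and the SIZE bits to label a data cycle. The sketch below is written in Python purely for readability. The function name, argument names, and returned labels are hypothetical, and the label for size code 11 assumes it marks one word of a double transfer.

```python
# Hypothetical decoder for one data cycle on the memory bus.
# Assumptions: read_high is the level of the read signal (HIGH = read),
# atomic is True while the atomic load/store signal is asserted,
# size_bits is the 2-bit value on the SIZE lines.

SIZE_NAMES = {
    0b00: "byte",
    0b01: "halfword",
    0b10: "word",
    0b11: "word (double transfer)",  # assumed meaning of code 11
}


def classify_data_cycle(read_high, atomic, size_bits):
    """Return (kind, direction, size) for one data cycle."""
    if size_bits not in SIZE_NAMES:
        raise ValueError("SIZE is a 2-bit field")
    # The read signal is LOW on store (write) cycles, HIGH otherwise.
    direction = "read" if read_high else "write"
    if atomic:
        kind = "atomic load/store"
    else:
        kind = "load" if read_high else "store"
    return kind, direction, SIZE_NAMES[size_bits]


if __name__ == "__main__":
    print(classify_data_cycle(True, False, 0b10))
    print(classify_data_cycle(False, True, 0b00))
```

An access-rights checker could use the same three inputs, together with the ASI value, to reject writes to read-only regions.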

WE (Asserted LOW) 
Write Cycle 

This signal is asserted only during 1) the second 
data cycle of store instructions, 2) the second and 
third data cycles of store double instructions, or 
3) the third data cycle of atomic load/store 
instructions. This signal is 3-stated when not 
asserted. 

NULL_CYC 
Null Cycle 

This signal indicates that the current memory 
access (whose address is held in the external 
memory address register) is nullified by the IU. It is 
used to disable cache miss in systems with cache, 
and for memory exception handling during the 
current memory access. 



IH_NULL (Asserted LOW) 
Null Cycle Reset 

When active, this signal resets NULL_CYC to 
LOW. 

LOCK 

Bus Lock Request 

LOCK is set HIGH when the IU needs the bus for 
multiple-cycle transactions. The bus may not be 
granted to another bus master as long as LOCK is 
active. 

HAL (Asserted LOW) 
Hold Address Latch 

HAL freezes the clock to the external memory 
address register. It is asserted during the execution 
of some multiple-cycle instructions, internal 
interlocks, and whenever at least one of the hold 
signals (MHOLDA, MHOLDB, SHOLD, BHOLD, or 
FHOLD) is asserted. 

DFETCH 

Data Fetch Cycle 

DFETCH marks the beginning of a data cycle. When 
DFETCH is HIGH, it indicates a data cycle and 
when DFETCH is LOW, it indicates an instruction 
cycle. The IU can nullify an instruction or data 
cycle by asserting NULL_CYC. 
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Floating-Point Unit Interface Signals 

The floating-point unit interface is a dedicated 
group of connections between the IU and the FPC 
and no external circuits are required. The interface 
consists of the following signals: 

FP (Asserted LOW) 
Floating-Point Unit Is Present 

When FP is LOW, it indicates that an FPU exists in 
the system. FP is tied to VDD by an internal resistor 
and is pulled to ground only when the FPU is 
present. The IU generates an fp_disabled trap if 
FP is HIGH during the execution of a floating-point 
instruction, a floating-point load or store, or an 
FBfcc. 

FCC[1:0] 

Condition Code Inputs 

The floating-point condition codes are valid only if 
FCCV is HIGH. An FBfcc instruction uses these bits 
to compute the next instruction address, and 
waits if FCCV is LOW. 

FCCV 

Condition Codes Valid 

The FPU asserts FCCV to indicate that FCC[1:0] 
are valid. The FPU must guarantee that FCCV is 
LOW (disasserted) if floating-point compare 
instructions are pending in the floating-point queue. 



FHOLD (Asserted LOW) 
Hold Input 

The FPU asserts FHOLD when it cannot continue 
executing instructions. When it receives an 
instruction, the FPU checks for dependencies, and if 
any are discovered, it asserts FHOLD during the same 
cycle or during the cycle that follows. FHOLD is 
latched into the IU, where it freezes the instruction 
pipeline in the following cycle. The FPU must 
disassert FHOLD to unfreeze the IU's instruction 
pipeline. 

FEXC (Asserted LOW) 
Exception Input 

The FPC asserts FEXC to indicate that a 
floating-point exception has occurred. It must remain 
asserted until the IU takes the trap and acknowledges 
by asserting FXACK. Floating-point exceptions are 
only taken during execution of floating-point 
instructions. 



Floating-Point Bus 

This dedicated 32-bit bus sends floating-point 
instructions and addresses to the FPU chip. Each 
floating-point instruction uses this bus for two 
cycles; the first cycle carries the instruction and the 
second cycle carries the address. 

FINS 

Floating-Point Instruction 

The IU asserts FINS during the cycle in which 
F[31:00] carries a valid floating-point instruction. 
The FPU uses this signal to latch the instruction 
into its instruction register. 

FADR 

Floating-Point Address 

The IU asserts FADR during the cycle in which 
F[31:00] carries a valid floating-point instruction 
address. The FPU uses this signal to latch the 
address into its address register. 

FEND 

End Floating-Point Instruction 

The IU generates FEND, which the FPU uses to 
synchronize the instruction/address in its execution 
pipeline with the IU's pipeline. The IU asserts FEND 
during the last cycle of a floating-point instruction 
in the IU's pipeline. 

FLUSH 

Flush Floating-Point Instruction 

The IU asserts FLUSH to cause the FPU to flush 
the instruction in its instruction register. This may 
happen when the IU takes a trap. FLUSH has no 
effect on instructions in the floating-point queue. 

FXACK 

Exception Acknowledge 

The IU asserts FXACK to indicate to the FPU that 
the current FEXC trap has been taken. The FPU 
must disassert FEXC after it receives FXACK so 
that the next floating-point instruction does not 
cause a repeated floating-point exception trap. 
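
The exception handshake described here can be pictured as a small latch on the FPU side: the exception request stays asserted until the acknowledge arrives. The Python sketch below is only a behavioral model under that reading. The class name and method are hypothetical, values are logical (True = asserted) rather than pin levels, and the choice that an acknowledge wins over a new exception in the same cycle is an assumption.

```python
class ExceptionLatch:
    """Behavioral model of the exception request / acknowledge handshake.

    'fexc' is the logical state of the exception request (True = asserted).
    The physical pin is asserted LOW; that inversion is not modeled here.
    """

    def __init__(self):
        self.fexc = False

    def step(self, exception, fxack):
        """Advance one cycle and return the new request state."""
        if fxack:
            # The IU has taken the trap: drop the request so the next
            # floating-point instruction does not trap again.
            self.fexc = False
        elif exception:
            # A new exception raises the request; it then stays up.
            self.fexc = True
        return self.fexc


if __name__ == "__main__":
    latch = ExceptionLatch()
    for exc, ack in [(True, False), (False, False), (False, True)]:
        print(latch.step(exc, ack))
```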



L64801 

High Performance 
Open Architecture 
RISC Microprocessor 

Preliminary 







Pin Descriptions 

(Continued) 



Miscellaneous I/O Signal Descriptions 

These signals are used by the IU to control external 
events or to receive input from external events. 



RESET (Asserted LOW) 
Reset Input 

Assertion of this pin will reset the Integer Unit. The 
RESET signal must be asserted for a minimum of 
eight processor clock cycles. After a RESET, the 
Integer Unit will start fetching from address 0. 

IRL[3:0] 

Interrupt Request Level 

The value on IRL defines the external interrupt 
request level. When IRL[3:0] = 0000, no interrupts are 
pending. External interrupts must be latched and 
prioritized by external logic before they are passed 
to the IU and held until they are acknowledged by 
the IU. External interrupts must be acknowledged 
by software. 
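
Since prioritization is left to external logic, a minimal sketch of that step may help. It assumes the usual SPARC convention that a numerically higher level has higher priority, with level 15 the highest. The function name is hypothetical and the code models only the encoding, not the latching or the software acknowledge.

```python
def encode_irl(pending_levels):
    """Return the 4-bit value to drive on IRL[3:0].

    pending_levels: iterable of latched interrupt levels, each 1..15.
    Returns 0 (no interrupt pending) when nothing is latched.
    """
    levels = list(pending_levels)
    for level in levels:
        if not 1 <= level <= 15:
            raise ValueError("interrupt levels run from 1 to 15")
    # Assumed priority rule: the highest pending level wins.
    return max(levels, default=0)


if __name__ == "__main__":
    print(format(encode_irl([3, 12, 7]), "04b"))
    print(format(encode_irl([]), "04b"))
```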



ERROR (Asserted LOW) 
Processor In Error State 

When the IU detects a trap while the ET bit in the 
PSR is 0, the processor saves the PC and NPC, 
sets the tt value in the TBR, enters into an error 
state, asserts ERROR and halts. To restart the 
processor from this state, external logic should 
send a RESET to the chip. 



CLK 
Clock Input 

The rising edge of CLK defines the beginning of 
each pipeline stage in the IU chip. CLK can have 
any duty cycle ranging from 30% to 70%. 

XSM 

Scan Mode Input 

During test and debug, this signal disables the 
normal clocks and activates the scan clocks for scan 
operations. XSM must be set HIGH during normal 
operation. 

SDO 

Scan Data Output 

SDO is the serial data output for the IU's scan 
path. 

PTREEO 

Parametric Tree Output 

This signal is the output of an internal test string, 
which tests parametric input levels during test. 
PTREEO is 3-stated when XSM is set HIGH. It 
need not be connected for normal operation. 
Pin Name    Description                             Input/Output            Active 

A[31:0]     Address                                 3-State Output 
ASI[7:0]    Address Space Identifier                3-State Output 
D[31:0]     Data                                    3-State Bidirectional 
HAL         Hold Address Latch                      Output                  LOW 
WE          Write Enable                            Output                  LOW 
RD          Read                                    Output                  HIGH 
DFETCH      Data Fetch Cycle                        Output                  HIGH 
SIZE[1:0]   Bus Transaction Size                    3-State Output 
LOCK        Multi-Cycle Bus Lock                    3-State Output 
MDS         Memory Data Strobe                      Input                   LOW 
AOE         Address Output Enable                   Input                   LOW 
ASIOE       ASI Output Enable                       Input                   LOW 
DOE         Data Output Enable                      Input                   HIGH 
MHOLDA      Memory Hold A                           Input                   LOW 
MHOLDB      Memory Hold B                           Input                   LOW 
MHOLDC      Memory Hold C                           Input                   LOW 
BHOLD       Bus Hold                                Input                   LOW 
SHOLD       System Hold                             Input                   LOW 
IRL[3:0]    Interrupt Request Level                 Input 
RESET       Reset                                   Input                   LOW 
TC          Trap Condition                          Input                   LOW 
MEXC        Memory Exception                        Input                   LOW 
ERROR       IU Error Mode                           Output                  LOW 
LDST        Load/Store Operation                    3-State Output          HIGH 
NULL_CYC    Null Cycle                              3-State Output          HIGH 
IH_NULL     Null Cycle Reset                        Input                   LOW 
PTREEO      Parametric Tree Output                  Output 
TSTO        Test Output                             Output 
XSM         Scan Mode Input                         Input                   HIGH 
FINS        Floating-Point Instruction              3-State Output          HIGH 
FADR        Floating-Point Address                  3-State Output          HIGH 
FEND        End Floating-Point Instruction          3-State Output          HIGH 
FLUSH       Flush Floating-Point Instruction        3-State Output          HIGH 
FXACK       Floating-Point Exception Acknowledge    3-State Output          HIGH 
FP          Floating-Point Unit Present             Input w/Pullup          LOW 
FCCV        FPU Condition Codes Valid               Input                   HIGH 
FCC[1:0]    FPU Condition Codes                     Input 
FHOLD       FPU Hold                                Input                   LOW 
FEXC        FPU Exception                           Input                   LOW 
F[31:0]     Floating-Point Bus                      3-State Output 
CLK         System Clock                            Input 
VDD         Input Circuit Power                     Power 
GND         Input Circuit Ground                    Ground 
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Absolute Maximum Ratings (Referenced to VSS) 



Parameter                               Symbol    Limits               Unit 

DC Supply Voltage                       VDD       -0.3 to +7           V 
Input Voltage                           VIN       -0.3 to VDD +0.3     V 
DC Input Current                        IIN       ±10                  mA 
Storage Temperature Range (Ceramic)     TSTG      -65 to +150          °C 
Storage Temperature Range (Plastic)     TSTG      -40 to +125          °C 



Recommended Operating Conditions 



Parameter                               Symbol    Limits               Unit 

DC Supply Voltage                       VDD       +3 to +6             V 
Operating Ambient Temperature Range 
  Military                              TA        -55 to +125          °C 
  Industrial Range                      TA        -40 to +85           °C 
  Commercial Range                      TA        0 to +70             °C 



DC Characteristics: Specified at VDD = 5 V ±5%, ambient temperature over the specified temperature range (1) 

Notes: 

1. Military temperature range is -55°C to +125°C, ±10% power supply (ceramic packages only); industrial temperature range is -40°C to +85°C, ±5% power 
supply; commercial temperature range is 0°C to 70°C, ±5% power supply. 

2. Requires two output pads. 

3. Type B4 output. Output short circuit current for other outputs will scale. Not more than one output may be shorted at a time for a maximum duration of one second. 

4. Not applicable to assigned bidirectional buffer (excluding package). 

5. Output using single buffer structure (excluding package). 



Symbol  Parameter                                   Condition                         Min     Typ     Max     Unit 

VIL     Voltage Input LOW 
          TTL Inputs                                                                                  0.8     V 
          CMOS Levels                                                                                 1.5     V 

VIH     Voltage Input HIGH 
          TTL Inputs, Commercial 
            Temperature Range                                                         2.0                     V 
          TTL Inputs, Military and 
            Industrial Temperature Range                                              2.25                    V 
          CMOS Levels                                                                 3.5                     V 

VT+     Schmitt-Trigger, Positive-going Threshold                                             3.0     4.0     V 

VT-     Schmitt-Trigger, Negative-going Threshold                                     1.0     1.5             V 

        Hysteresis, Schmitt Trigger                 VIL to VIH, VIH to VIL            1.0     1.5             V 

IIN     Input Current, CMOS, TTL Inputs             VIN = VDD or VSS                  -10     ±1      10      µA 
          Inputs with Pulldown Resistors            VIN = VDD                         10      35      120     µA 
          TTL Inputs & Inputs with 
            Pullup Resistors                        VIN = VSS                         -8      -30     -100    µA 

VOH     Voltage Output HIGH                         IOH (Comm / Mil)                  2.4     4.5             V 
          Type B1                                   -1 mA / -0.8 mA 
          Type B2                                   -2 mA / -1.6 mA 
          Type B4                                   -4 mA / -3.2 mA 
          Type B6                                   -6 mA / -4.8 mA 
          Type B8 (2)                               -8 mA / -6.4 mA 
          Type B12 (2)                              -12 mA / -9.6 mA 

VOL     Voltage Output LOW                          IOL (Comm / Mil)                          0.2     0.4     V 
          Type B1                                   1 mA / 0.8 mA 
          Type B2                                   2 mA / 1.6 mA 
          Type B4                                   4 mA / 3.2 mA 
          Type B6                                   6 mA / 4.8 mA 
          Type B8 (2)                               8 mA / 6.4 mA 
          Type B12 (2)                              12 mA / 9.6 mA 

IOZ     3-State Output Leakage Current              VOH = VSS or VDD                  -10     ±1      10      µA 

IOS     Output Short Circuit Current (3)            VDD = Max, VO = VDD               15      50      130     mA 
                                                    VDD = Max, VO = 0 V               -5      -25     -100    mA 

IDD     Quiescent Supply Current                    VIN = VDD or VSS                  User-Design Dependent 

CIN     Input Capacitance                           Any Input (4)                             2               pF 

COUT    Output Capacitance                          Any Output (5)                            4               pF 
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Load Transactions 



Figure 8 shows the timing for a load integer 
instruction. This instruction causes a one-cycle 
delay; during T4, the bus contains the datum to be 
loaded and the processor cannot use it to fetch I4. 
Because of this delay, I4 is fetched during T5. 

The delay also gives the IU time to deal with any 
trap caused by I1. 

Figure 9 shows the timing for a load double integer. 
This works similarly to the load integer, except that 
it uses the bus during T4 to load the first half and 
during T5 to load the second. Note that the 
address of the second load is equal to the address of 
the first load +4 and that the size bits = 1,1 during 
T4 and T5. The processor fetches I4 during T6. 

Figure 10 shows the timing for a load floating-point 
instruction. It works like the load integer except 
that it also generates floating-point control signals 
in T3, T4 and T5. 

Figure 11 shows the timing for a load double 
floating-point instruction. It works like the double 
integer instruction except that it generates additional 
floating-point signals in T3, T4 and T5. 
Figure 12 shows a store integer instruction; these 
take two extra cycles. During T4, the address of 
the store goes on the bus; during T5, the address 
remains on the bus and store data goes on the bus 
as well. This requires two extra cycles because the 
processor cannot send both the address and the 
data out simultaneously, and because the 
processor has to wait to see if the store is going to 
generate an exception or a cache miss. It fetches 
I4 during T6. 

Figure 13 shows the timing for a store double 
integer instruction. It works like the store integer 
timing except that the processor must delay an extra 
cycle to repeat the store operation for the second 
word. Note that the address of the second store is 
equal to the first address +4, and that the size bits 
are set to 1,1 to indicate a double operand. 

Figure 14 shows the timing for a floating-point 
store. This works similarly to the integer store, 
except that it generates the additional 
floating-point signals, FINS, FADR, and FEND during T3, ... 
and T6. 

Figure 15 shows the timing for a store double 
floating-point instruction. It works just like the store 
floating-point instruction except that it requires an 
extra cycle to store the second half of the 
floating-point operand. 
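
The load and store descriptions imply a simple per-instruction cost: one extra cycle for a load, two for a load double or a store, and three for a store double. The Python sketch below adds these up for a short instruction sequence. It assumes one base cycle per instruction and ignores holds, traps, cache misses, and floating-point latency; the names are hypothetical.

```python
# Extra bus cycles per instruction class, as read from the timing text:
# load: one-cycle delay; load double: the next fetch slips by two cycles;
# store: two extra cycles; store double: one more than a store.
EXTRA_CYCLES = {
    "load": 1,
    "load_double": 2,
    "store": 2,
    "store_double": 3,
}


def estimate_cycles(instructions):
    """Estimate total cycles for a list of instruction class names.

    Any class not listed in EXTRA_CYCLES is assumed to take one cycle.
    """
    return sum(1 + EXTRA_CYCLES.get(op, 0) for op in instructions)


if __name__ == "__main__":
    print(estimate_cycles(["alu", "load", "store"]))
```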
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Figure 13. Store Double Integer Timing 
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Atomic transactions consist of two or more steps 
which are indivisible; once the sequence is started, 
it cannot be interrupted. To ensure that it has the 
bus for the second transaction, the IU asserts 
LOCK for as long as necessary. 



The atomic load and store unsigned byte is the only 
atomic transaction currently supported. It takes 
seven cycles and is described in The SPARC 
Architecture Manual. 

Figure 16 shows an atomic load and store unsigned 
byte. 
Figure 16. Atomic Load-Store Unsigned Byte Timing 



Floating-Point 
Operations 



The IU fetches and decodes FPops, then 
broadcasts them to the FPU controller over the 
floating-point bus (F[31:0]). It also provides control 
signals to inform the FPU controller when an FPop is 
decoded. During an FPop, the IU puts the instruction 
on the floating-point bus during the execute cycle 
and puts the instruction address on the 
floating-point bus during the write cycle. 






The FPU controller stops the IU by asserting 
FHOLD if it detects a condition that requires 
it to delay executing the current floating-point 
instruction. This can happen under the following 
conditions: 

When a store FSR instruction starts execution and 
FPops are pending in the floating-point queue. In 
this case, the FPU controller detects the condition 
and asserts FHOLD. The store FSR instruction 
must wait until all pending FPops complete 
execution. 
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When FPop is issued and there is either a resource 
or an operand dependency between the present 
FPop and one or more of the previously fetched 
instructions. 

When a branch on floating-point condition (FBfcc) 
starts executing while the floating-point conditions 
are not ready. This occurs when one of the 
previously fetched instructions is a floating-point 
compare (FCMP) that the FPU has not yet completed. 

Figure 17 shows the timing for a floating-point 
operation. 
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Figure 17. Floating-Point Operations Timing 
Because the L64801 chip set is a bus slave, bus 
arbitration must be performed externally using the 
BHOLD and LOCK pins. The L64801 IU asserts 
LOCK when it needs to retain the bus. External 
hardware should assert BHOLD when it needs to 
keep the L64801 from using the bus. 



When BHOLD is asserted, it stops the processor's 
pipeline until it is disasserted. The signals DOE and 
AOE can be used to turn off the output drivers of 
the data bus, the address bus and the other control 
signals. This allows these to be driven by external 
hardware. Figure 18 shows the bus arbitration 
timing. 
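
One way to picture the external arbiter is as a per-cycle decision: grant the bus (by asserting the bus hold input) only when another master is requesting it and the IU is not holding the lock. The Python sketch below models that rule. It is an assumption about how such an arbiter might behave, not a description of any particular design, and the function name and the (request, lock) input format are hypothetical.

```python
def arbitrate(cycles):
    """Return, per cycle, whether the bus hold input should be asserted.

    cycles: iterable of (request, lock) pairs, where request is True when
    another bus master wants the bus and lock is True while the IU
    signals that it needs to retain the bus.
    """
    granted = False
    hold = []
    for request, lock in cycles:
        if granted:
            # Once granted, keep the bus until the other master releases it.
            granted = request
        else:
            # Never take the bus away in the middle of a locked sequence.
            granted = request and not lock
        hold.append(granted)
    return hold


if __name__ == "__main__":
    trace = [(True, True), (True, False), (True, False), (False, False)]
    print(arbitrate(trace))
```

While the hold is asserted, the arbiter would also turn off the processor's address and data output drivers so the other master can drive the bus.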
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Figure 18. Bus Arbitration Timing 
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D.16 


75    D.0         115   D.31 
76    DFETCH      116   D.8 
77    DOE_        117   D.4 
78    IRL.0       118   VSS 
79    NULL_CYC    119   D.12 
80    FEND        120   F.30 



No.   Signal 

121   F.22 
122   F.18 
123   F.26 
124   D.14 
125   D.10 
126   F.10 
127   F.2 
128   F.6 
129   F.11 
130   F.27 
131   F.23 
132   F.19 
133   F.15 
134   FCC.1 
135   VDD 
136   F.7 
137   F.3 
138   F.16 
139   F.28 
140   CLK 
141   VSS 
142   F.20 
143   F.0 
144   F.24 
145   FCC.0 
146   F.12 
147   VDD 
148   F.8 
149   F.4 
150   F.29 
151   F.21 
152   F.31 
153   F.17 
154   F.25 
155   F.13 
156   VSS 
157   F.9 
158   F.5 
159   F.1 
160   A.1 



The Abacus 3172 is a single-chip 
floating-point coprocessor for the 
Fujitsu S-2(]i^B implementation 
of the SPARC architecture. It incor- 
porates a floating-point datapath 
and a floating-point controller. The 
Abacus 3172 provides direct inter- 
face to the integer unit and memory. 
It is available in speed grades of 2G 
JM^MHz. 

Related product: The Abacus 3171 
single-chip floating-point coprocessor 
for the Cypress 7C601 implementation 
of the SPARC architecture. 
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Features 



SINGLE-CHIP 64-BIT FLOATING-POINT DATA 
PATH AND CONTROLLER 

64-bit multiplier and divide/square root unit 
64-bit ALU 

16x64 or 32x32 three-port register file with an 
independent load/store port 



DIRECT INTERFACE TO FUJITSU S-2 AND 
LSI LOGIC L64801 SPARC PROCESSORS 



DIRECT INTERFACE TO MEMORY 

FULL COMPLIANCE WITH ANSI/IEEE-754 
STANDARD FOR BINARY FLOATING-POINT 
ARITHMETIC 

143-PIN PGA PACKAGE 

LOW-POWER CMOS 



Description 

The Abacus 3172 is a high-performance, single-chip 
floating-point coprocessor for the Fujitsu S-20 and 
JBS5/LSI Logic L64801 implementation of the SPARC 
architecture. It incorporates a floating-point datapath 
and a floating-point controller. The Abacus 3172 
provides direct interface to the integer unit and memory. It 
is available in speed grades of 20 and t^jS^z. 



The floating-point datapath circuitry contains a 64-bit 
multiplier, a 64-bit ALU, a 64-bit divide/square root 
unit, and a 16-word by 64-bit (or 32-word by 32-bit) 
three-port register file. 

The floating-point controller circuitry handles IEEE ex- 
ceptions and the interface between the floating-point da- 
tapath and the integer unit, as well as between the data- 
path and memory. 

CONFORMANCE TO SPARC ARCHITECTURE 

The Abacus 3172 processes instructions within the 
specifications of the SPARC architecture as described in the 
SPARC Architecture Manual, by Sun Microsystems. 

DATA TYPES 

The SPARC architecture specifies four data types that 
can be used in conjunction with the floating-point unit 
(FPU): 

o 32-bit two's complement integer 
o single-precision floating-point 
o double-precision floating-point 
o extended-precision floating-point 



The Abacus 3172 supports all of these data types except 
extended-precision. Any operation specifying 
extended-precision data types will be trapped to system software, 
with unimplemented instruction trap type. 

INSTRUCTION PROCESSING 

When the integer unit (IU) decodes a floating-point 
operate (FPop) or a floating-point load/store (FPLd/St) 
instruction, it sends the instruction to the FPU over the 
F bus during the Execute stage of the IU pipeline. 

During the Write stage of the IU pipeline, the IU sends 
the FPop address over the F bus to the FPU so that it may 
be available for floating-point exception handling. Also 
during this cycle, the FPU will assert FHOLD- if a 
dependency exists. FHOLD- will remain asserted until the 
dependency has been resolved. 

CONFORMANCE TO ANSI/IEEE-754 
SPECIFICATION FOR BINARY FLOATING-POINT 
ARITHMETIC 

The Abacus 3172 conforms to the requirements of the 
ANSI/IEEE-754 specification. 

FLOATING-POINT STATE REGISTER (FSR) 

The SPARC Architecture Manual contains detailed 
information about the Floating-Point State Register 
(FSR). Bits 19:17 of the FSR comprise the version field. 
The version field specifies the particular floating-point 
unit/controller implementation. In the case of the 3172, 
FSR (19:17) = 011 (binary). 
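
For software that needs to identify the floating-point unit, the version field can be pulled out of a stored FSR value with a shift and a mask. The short Python sketch below shows the arithmetic; the function name is hypothetical, and the FSR value is assumed to have been read already (for example, from memory after a store FSR instruction).

```python
FSR_VERSION_SHIFT = 17   # version field occupies bits 19:17
FSR_VERSION_MASK = 0b111  # three bits wide


def fsr_version(fsr):
    """Extract the 3-bit version field from a 32-bit FSR value."""
    return (fsr >> FSR_VERSION_SHIFT) & FSR_VERSION_MASK


if __name__ == "__main__":
    example_fsr = 0b011 << FSR_VERSION_SHIFT  # only the version bits set
    print(format(fsr_version(example_fsr), "03b"))
```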



Description, continued 

IMPLEMENTED INSTRUCTIONS 

Operations involving NaNs and denormalized numbers 
require system software assistance or intervention. 
They terminate with trap type unfinished. 



Mnemonic(s)        Operation 

ldf                Load floating-point register 
lddf               Load double floating-point register 
ldfsr              Load floating-point status register 
stf                Store floating-point register 
stdf               Store double floating-point register 
stfsr              Store floating-point status register 
stdfq              Store double floating-point queue 
fitos, fitod       convert integer to floating-point (rounded as per fsr.rd) (single/double) 
fstoi, fdtoi       convert floating-point to integer (rounded toward zero) (single/double) 
fstod, fdtos       convert single to double/double to single floating-point 
fmovs              register to register move 
fnegs              register to register move with sign bit inverted 
fabss              register to register move with sign bit set to 0 
fsqrts, fsqrtd     floating-point square root (single/double) 
fadds, faddd       floating-point add (single/double) 
fsubs, fsubd       floating-point subtract (single/double) 
fmuls, fmuld       floating-point multiply (single/double) 
fdivs, fdivd       floating-point divide (single/double) 
fcmps, fcmpd       floating-point compare (single/double) 
fcmpes, fcmped     floating-point compare and exception if unordered (single/double) 

Figure 1. Implemented instructions 



UNIMPLEMENTED INSTRUCTIONS 


Mnemonic(s)        Operation 

fitox              convert integer to extended floating-point (rounded as per fsr.rd) 
fxtoi              convert extended floating-point to integer (rounded toward zero) 
fxtos, fxtod       convert extended floating-point to single/double floating-point 
fstox, fdtox       convert single/double floating-point to extended floating-point 
fsqrtx             floating-point square root (extended-precision) 
faddx              floating-point add (extended-precision) 
fsubx              floating-point subtract (extended-precision) 
fmulx              floating-point multiply (extended-precision) 
fdivx              floating-point divide (extended-precision) 
fcmpx              floating-point compare (extended-precision) 
fcmpex             floating-point compare and exception if unordered (extended-precision) 
fsmuld             single product to double 
fdmulx             double product to extended 

Figure 2. Unimplemented instructions 
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Description, continued 

DEVICE DESCRIPTION 




Figure 3. Conceptual block diagram 



Figure 4. Simplified block diagram 

(Blocks shown: F bus input; address queue; instruction queue (depth 2); load and store paths; control and status logic; FSR; two 16x32 register files, RFILE MS and RFILE LS; and the multiply, divide, compare, and square root datapath producing RESULT.) 



ABACUS 3172 
FLOATING-POINT 
COPROCESSOR FOR 
SPARC 

PRELIMINARY DATA 
August 1989 



Figure 5. Abacus 3172 signals 

(Integer unit interface: FP-, FCC, FCCV, FHOLD-, FEXC-, FXACK, FINS, FADR, FEND, FLUSH, and the 32-bit F bus. Memory/system interface: D bus, DOE-, MHOLDA-, MHOLDB-, MHOLDC-, SHOLD-, BHOLD-, MDS-, and RESET-. Also shown: VCC, GND, and CLK.) 






SIGNAL DESCRIPTION 

Signals marked with a minus sign (-) after their names 
are active low; all other signals are active high. 

INTEGER UNIT INTERFACE SIGNALS 

FP- OUTPUT 

Floating-point unit present. The FP- signal indicates 
whether a floating-point unit (FPU) is present in the 
system. In the absence of an FPU the FP- signal is pulled 
up to VCC by a resistor. When an FPU is present the 
FP- signal is grounded. 

FCC OUTPUT 

Floating-point condition code. The FCC(1:0) bits 
represent the current condition code of the FPU. They are 
valid only if FCCV is asserted. 

FBfcc instructions use these bits during the execute 
cycle if they are valid, and delay the execute cycle if they 
are not valid. The condition codes are shown below. 

FCC (1)   FCC (0)   CONDITION 

0         0         Equal 
0         1         Op1 < Op2 
1         0         Op1 > Op2 
1         1         Unordered 

Figure 6. 
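
The condition code encoding follows the standard SPARC convention (00 equal, 01 less, 10 greater, 11 unordered). As a small illustration, the Python sketch below computes the code a compare of two values would produce and decodes it back to a label. The function names are hypothetical, and the decoder returns None when the valid signal is not asserted, mirroring the rule that the bits are meaningful only while FCCV is asserted.

```python
import math

FCC_CONDITIONS = {
    0b00: "equal",
    0b01: "op1 < op2",
    0b10: "op1 > op2",
    0b11: "unordered",
}


def fcc_for(op1, op2):
    """Return the 2-bit condition code for comparing two floats."""
    if math.isnan(op1) or math.isnan(op2):
        return 0b11  # any NaN operand makes the comparison unordered
    if op1 == op2:
        return 0b00
    return 0b01 if op1 < op2 else 0b10


def decode_fcc(fcc1, fcc0, fccv):
    """Decode the two condition code bits; None while they are not valid."""
    if not fccv:
        return None
    return FCC_CONDITIONS[(fcc1 << 1) | fcc0]


if __name__ == "__main__":
    code = fcc_for(1.0, 2.0)
    print(decode_fcc(code >> 1, code & 1, True))
```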

FCCV OUTPUT 

Floating-point condition code valid. The FPU asserts 
the FCCV signal when FCC bits represent a valid condi- 
tion. The FPU deasserts FCCV if pending floating-point 
compare instructions exist in the floating-point queue. 
FCCV is reasserted when the compare instruction is 
completed and FCC bits are valid. 

FHOLD- OUTPUT 

Floating-point hold. The FHOLD- signal is asserted by 
the FPU if it cannot continue execution due to a 
resource or operand dependency. The FPU checks for all 
dependencies in the write stage and, if necessary, asserts 
FHOLD- in the same cycle. The FHOLD- signal is used 
by the IU to freeze its pipeline in the next cycle. The 
FPU must eventually deassert FHOLD- to release the 
IU's pipeline. 



FEXC- OUTPUT 

Floating-point exception. The FEXC- signal is asserted 
if a floating-point exception has occurred. It remains 
asserted until the IU acknowledges that it has taken a 
trap by asserting FXACK. Floating-point exceptions are 
taken only during the execution of a floating-point 
instruction, FBfcc instruction, or floating-point load or 
store instructions. When the FPU receives an asserted 
level of the FXACK signal it deasserts FEXC-. 

FXACK INPUT 

Floating-point exception acknowledge. The FXACK 
signal is asserted by the IU to acknowledge to the FPU that 
the current FEXC- trap is taken. 

FINS INPUT 

Floating-point instruction. The IU asserts FINS during 
the cycle in which F(31:0) carries a valid floating-point 
instruction. The FPU uses this signal to latch the 
instruction into its instruction register. 

FADR INPUT 

Floating-point address. The IU asserts FADR during the 
cycle in which F(31:0) carries a valid floating-point 
instruction address. The FPU uses this signal to latch the 
address into its address register. 

FEND INPUT 

End floating-point instruction. The IU asserts FEND 
during the last cycle of a floating-point instruction in the 
IU pipeline. The FPU uses FEND to synchronize the 
instruction/address in its execution pipeline with the IU 
pipeline. 

FLUSH INPUT 

Floating-point instruction flush. The FLUSH signal is 
asserted by the IU to signal to the FPU to flush the 
instructions in its instruction registers. This may happen when a 
trap is taken by the IU. The IU will restart the flushed 
instructions after returning from the trap. FLUSH has no 
effect on instructions in the floating-point queue. 

F BUS INPUT 

Floating-point bus. F(31:0) is a dedicated 32-bit bus that 
receives floating-point instructions and addresses from 
the IU. Each floating-point instruction must use this bus 
for two cycles. The first cycle carries the instruction and 
the second its address. 
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Description, continued 

SYSTEM/MEMORY INTERFACE SIGNALS 

D BUS INPUT/OUTPUT 

Data bus. The D(31:0) bus is driven by the FPU only 
during the execution of floating-point store instructions. 
The alignment for load and store instructions is done in 
the FPU. A double word is aligned on an 8-byte 
boundary, a word is aligned on a 4-byte boundary. 
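
The alignment rule amounts to a remainder test on the address. The brief Python sketch below states it explicitly; the function name and the example addresses are hypothetical.

```python
WORD_BYTES = 4
DOUBLE_WORD_BYTES = 8


def is_aligned(address, size_bytes):
    """True if address falls on a size_bytes boundary."""
    return address % size_bytes == 0


if __name__ == "__main__":
    for addr in (0x1000, 0x1004, 0x1006):
        print(hex(addr),
              is_aligned(addr, WORD_BYTES),
              is_aligned(addr, DOUBLE_WORD_BYTES))
```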

DOE- INPUT 

Data output enable. The DOE- signal is connected 
directly to the data output drivers and must be asserted 
during normal operation. Deassertion of this signal 
tri-states all output drivers on the data bus. This signal 
should be deasserted only when the bus is granted to 
another bus master, i.e., when BHOLD-, MHOLDA-, 
MHOLDB-, MHOLDC-, or SHOLD- is asserted. 

MHOLDA-, MHOLDB-, MHOLDC-, SHOLD- 
INPUTS 

Memory hold. Asserting any of MHOLDA-, MHOLDB-, 
MHOLDC-, or SHOLD- freezes the FPU pipeline. 

BHOLD- INPUT 

Bus hold. The BHOLD- signal is asserted by the system's 
I/O controller when an external bus master requests the 
data bus. Assertion of this signal will freeze the FPU 
pipeline. 



MDS- INPUT 

Memory data strobe. The MDS- signal is used to load 
data into the FPU when the internal FPU clock is 
stopped while on hold. 

RESET- INPUT 

Reset. Asserting the RESET- signal resets the pipeline 
and sets the writable fields of the floating-point status 
register (FSR) to zero. The RESET- signal must remain 
asserted for a minimum of eight cycles. After a reset, the 
IU will start fetching from address 0. 

CLK INPUT 

Clock. CLK is used for clocking the FPU. It is high dur- 
ing the first half of the processor cycle and low during 
the second half. The rising edge of CLK defines the be- 
ginning of each pipeline stage in the FPU. 

VCC 

Power supply. All VCC pins must be connected to a 5.0 
volt power supply. 

GND 

System ground. All GND pins must be connected to sys- 
tem ground. 

NC 

No connection. All no-connect pins must remain uncon- 
nected. 




SYSTEM CONSIDERATIONS 

INSTRUCTION CYCLE COUNTS 

The 3172 has the following datapath instruction cycle 
counts. In order to arrive at register-to-register cycle 
counts, one cycle must be added to each number below. 



Mnemonic(s)        Operation                                     Clock Cycles 

fmovs              move 
fnegs              negate 
fabss              absolute value 
fadds, fsubs       add/subtract single 
faddd, fsubd       add/subtract double 
fmuls              multiply single 
fmuld              multiply double 
fcmps              compare single 
fcmpd              compare double 
fcmpes             compare single and exception if unordered 
fcmped             compare double and exception if unordered 
fitos              convert integer to single 
fitod              convert integer to double 
fstod              convert single to double 
fdtos              convert double to single 
fdivs              divide single 
fdivd              divide double 
fsqrts             square root single 
fsqrtd             square root double 



LINPACK BENCHMARK ESTIMATE

The code shown below represents the inner loop of the
SAXPY subroutine of the LINPACK benchmark. This
loop requires 60 cycles on the Abacus 3170. At
25 MHz this translates into a peak performance of
3.33 MFLOPS.



loop_top:
        ldd     [dx+0],dx0
        fmuld   dx0,da,dx0
        ldd     [dy+0],dy0
        ldd     [dx+8],dx1
        addcc   n,-4,n
        faddd   dx0,dy0,dy0
        fmuld   dx1,da,dx1
        ldd     [dy+8],dy1
        ldd     [dx+16],dx2
        add     dx,32,dx
        faddd   dx1,dy1,dy1
        fmuld   dx2,da,dx2
        ldd     [dy+16],dy2
        ldd     [dx-8],dx3
        add     dy,32,dy
        faddd   dx2,dy2,dy2
        fmuld   dx3,da,dx3
        ldd     [dy-8],dy3
        std     dy0,[dy-32]
        std     dy1,[dy-24]
        faddd   dx3,dy3,dy3
        std     dy2,[dy-16]
        bg      loop_top
        std     dy3,[dy-8]

Figure 8. LINPACK benchmark code
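
The 3.33 MFLOPS estimate can be checked with simple arithmetic. The sketch below assumes that one pass through the loop performs eight floating-point operations (four fmuld and four faddd, as the listing appears to contain); that count and the function name are assumptions for illustration, not figures stated in the text.

```python
def peak_mflops(flops_per_loop: int, cycles_per_loop: int, clock_mhz: float) -> float:
    """Millions of floating-point operations per second for one loop body.

    clock_mhz cycles happen per microsecond, so flops * MHz / cycles
    gives operations per microsecond, which is MFLOPS.
    """
    return flops_per_loop * clock_mhz / cycles_per_loop


if __name__ == "__main__":
    # Assumed: 8 operations per pass; 60 cycles and 25 MHz come from the text.
    print(round(peak_mflops(8, 60, 25.0), 2))
```

With these inputs the result rounds to the 3.33 figure quoted above.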



Figure 7. Implemented instructions

System Considerations

INTERFACE TO IU AND MEMORY

Figure 9. Interface to integer unit and memory



System Considerations, continued 

INSTRUCTION OPERATION 



[Timing diagram: CLK, FINS, F BUS, D1 REG, REGISTER READ, FPOP, REGISTERS AA, AB, MA, MB, RESULT LATCH (LAT2), REGISTER WRITE, and FEXC-, showing operands and results for FPOP N, N+1, and N+2.]

Figure 10. Instruction operation

Specifications

ABSOLUTE MAXIMUM RATINGS

Supply voltage . . . . . . . . . . . . . . -0.5 to 7.0 V
Input voltage . . . . . . . . . . . . . . . -0.5V to VCC
Output voltage . . . . . . . . . . . . . . -0.5V to VCC
Operating temperature range (Tcase) . . . . 0° to 85° C
Storage temperature range . . . . . . . . . -65° C to 150° C
Lead temperature (10 seconds) . . . . . . . 300° C
Junction temperature . . . . . . . . . . . 155° C

Figure 11.

OPERATING CONDITIONS 



PARAMETER                                MIN     MAX     UNIT

VCC     Supply voltage                   4.75    5.25    V
IOH     High-level output current                -1.0    mA
IOL     Low-level output current                 4.0     mA
Tcase   Operating case temperature               85      °C



Figure 12. 

DC SPECIFICATIONS 



PARAMETER                                TEST CONDITIONS                      MIN    MAX    UNIT

VIH     High-level input voltage         VCC = MIN                            2.1           V
VIL     Low-level input voltage          VCC = MIN                                   0.8    V
VIHC    High-level input voltage         VCC = MIN                            2.4           V
VILC    Low-level input voltage          VCC = MIN                                   0.8    V
VOH     High-level output voltage        VCC = MIN, IOH = MAX                 2.4           V
VOL     Low-level output voltage         VCC = MIN, IOL = MAX                        0.4    V
II      Input leakage current            VCC = MAX, VIN = 0 to VCC                   ±10    mA
ILO     Output leakage current           VCC = MAX, VIN = 0 or VCC                   ±10    mA
        (output disabled)
CIN     Input capacitance*               VCC = MAX, VIN = 0 to VCC                   15     pF
COUT    Output capacitance*              VCC = MAX, VOUT = 0 to VCC                  20     pF
CCLK    Clock input capacitance*         VCC = MAX, VIN = 0 to VCC                   25     pF
CDOE-   Input capacitance*               VCC = MAX, VIN = 0 to VCC                   30     pF
ICC     Supply current                   VCC = MAX, TCY = MIN; TTL inputs                   mA

* Guaranteed, but not tested



Figure 13. DC specifications

Specifications, continued

AC SPECIFICATIONS AND TIMING DIAGRAMS

SYMBOL  DESCRIPTION                            Min/Max   Reference   20 MHz   25 MHz

TCY     Clock Cycle Time                       MIN                   50       40
TCH     Clock High Time                        MIN                   15       12
TCL     Clock Low Time                         MIN                   15
TR      CLK Rise Time                          MIN                   3
TF      CLK Fall Time                          MIN                   3
T1      FINS Setup Time                        MIN       CLK+        16
T2      FINS Hold Time                         MIN       CLK+        4
T3      F bus (Abus) Instruction Setup Time    MIN       CLK+        6
T4      F bus (Abus) Instruction Hold Time     MIN       CLK+        6
T5      FADR Setup Time                        MIN       CLK+        16
T6      FADR Hold Time                         MIN       CLK+        4
T7      D bus Data Load Setup Time             MIN       CLK+        5
T8      D bus Data Load Hold Time              MIN       CLK+        5
T9      FEND Setup Time                        MIN       CLK+        16
T10     FEND Hold Time                         MIN       CLK+        4
T11     D bus Data Store Output Delay Time     MAX       CLK+        33
T12     D bus Data Store Output Valid Time     MIN       CLK+        6
T13     MHOLDA- Setup Time*                    MIN       CLK-/+      6/25
T14     MHOLDA- Hold Time*                     MIN       CLK-        6
T15     FHOLD- Output Delay Time               MAX       CLK+        44       35
T16     FHOLD- Output Valid Time               MIN       CLK+        8
T17     MDS- Setup Time                        MIN       CLK-/+      6/25     6/20
T18     MDS- Hold Time                         MIN       CLK-        6
T19     FCCV Output Delay Time                 MAX       CLK+        44
T20     FCCV Output Valid Time                 MIN       CLK+        8
T21     FCC1..0 Output Delay Time              MAX       CLK+        44       34
T22     FCC1..0 Output Valid Time              MIN       CLK+        8
T23     FLUSH Setup Time                       MIN       CLK+        22
T24     FLUSH Hold Time                        MIN       CLK+        4
T25     FXACK Setup Time                       MIN       CLK+        16
T26     FXACK Hold Time                        MIN       CLK+        4
T27     FEXC- Output Delay Time                MAX       CLK+        30
T28     FEXC- Output Valid Time                MIN       CLK+        7
T29     RESET- Setup Time                      MIN       CLK+        12
T30     RESET- Hold Time                       MIN       CLK+        5
T31**   D Bus Turn-off Time                    MIN/MAX   DOE-        6/33
T32**   D Bus Turn-on Time                     MIN/MAX   DOE-        6/33

* Specifications for MHOLDB-, MHOLDC-, SHOLD-, and BHOLD- are the same
** Guaranteed, but not tested









Figure 14. AC specifications

ABACUS 3170
FLOATING-POINT
COPROCESSOR FOR
SPARC



PRELIMINARY DATA 

August 1989 



Specifications, continued 



[Timing diagrams. The CLK waveform shows TCY, TCH, and TCL.

Input setup and hold times with respect to clock rising edge:
  Ts: T1, T3, T5, T7, T9, T13, T17, T23, T25, T29
  Th: T2, T4, T6, T8, T10, T24, T26, T30

Input setup and hold times with respect to clock falling edge:
  Ts: T13, T17
  Th: T14, T18

Output valid and output delay times:
  TOD: T11, T15, T19, T21, T27
  TOV: T12, T16, T20, T22, T28

Asynchronous data output enable timing (DOE-, D bus data store): data at the pads, but not driven out since the bus is not enabled; then D bus store valid.]

MSH = Most significant half of a 64-bit word
LSH = Least significant half of a 64-bit word



Figure 15. Timing diagrams

Specifications, continued

[Diagram: CLK and signal waveforms. Delay measurements are made with reference to the 1.5V threshold.]

Figure 16. Reference levels in delay measurements

[Diagram: DOE- and bus output waveforms showing T31 (bus turn-off time) and T32 (bus turn-on time) between the valid and high-impedance (2.0V) states. Marked levels: 3.5V, 2.4V, 2.0V, 0.8V, 0.4V.]



Figure 17. Tri-state timing

Specifications, continued

I/O CHARACTERISTICS

[Diagram: output pin with a 50 pF load capacitor, connected through a 400 Ω resistor to 2.0V.]



Figure 18. AC test load

Pin Configuration

[Pin grid diagram: 15 x 15 143-pin PGA, top view, cavity down, with the pin A1 identifier marked. Pins shown: F31-F0, D31-D0, FCCV, FCC1, FCC0, FXACK, RESET-, FEXC-, CLK, FHOLD-, DOE-, MHOLDA-, MHOLDB-, MHOLDC-, SHOLD-, BHOLD-, MDS-, FLUSH, FADR, FINS, FEND, FP-, VCC, GND, and NC.]



Note: NC = not connected: pins so marked must be left unconnected. 
There is no pin at A1 . A1 is a locator hole. 



Figure 19. 






Physical Dimensions 






143-PIN GRID ARRAY

[Package drawing: bottom view, side view, and top view, with the A1 location marked.]

DIMENSIONS

Symbol   INCHES               MM

A1       0.100 ± 0.010        2.54 ± 0.20
A2       0.180 typ.           4.57 typ.
A3       0.050 typ.           1.27 typ.
D        1.575 sq. ± 0.016    40.0 ± 0.41
E1       1.400 sq. ± 0.012    35.56 ± 0.30
E2       0.050 dia. typ.      1.27 dia. typ.
E3       0.018 ± 0.002        0.46 ± 0.05
d        0.065 dia. typ.      1.65 dia. typ.
e        0.100 typ.           2.54 typ.



Ordering Information 




Case Temperature Range 



Order Number 



3l7Q"0g0 - QCD 



Revision Summary 



The following changes have been made in this data sheet 
relative to the previous edition (May 1989). 



Change 



Page 



Instruction Cycle Counts section added 

FEXC- Output valid time changed from 8 to 7 ns for 20 MHz 




S4-Cache 

Preliminary 



D 



Features 



Implements 64-256 KByte write-through instruction/data cache with 16-byte line size

Performs cache tag comparison 

Controls SBus reads and writes 

Automatically fills cache on cache misses 

Controls mastership of SBus for DMA 

Performs buffered writes with external write buffer 

Replaces cache tag read/write buffers 

Performs cache flush comparisons

Controls system-wide byte packing

Contains Sun-4 Virtual Address Error latches

Maintains copy of 4-bit Sun-4 context register

Contains Sun-4 System Enable Register 

Contains Sun-4 Bus Error Registers 

Monitors bus for unacknowledged transfers

Generates system reset 



[Block diagram. Signals shown: iu_ah(31:18), iu_al(17:0), iu_asi(3:0), iu_siz(1:0), iu_rd, mmu_typ(1:0), mmu_(x,v,s,w), sb_br(2:0)_, sb_ack(32,8)_, iod(7:0), por_, iu_error, sb_a(29:0), sb_siz(2:0), iu_shold, iu_aoe_, iu_mds_, iu_mexc_, sb_as_, wb_ce_, wb_oe_, cd_oe_, sb_bg(2:0)_, ctwe_en_, cdwe_en_, and sb_reset. Internal blocks shown: System Enable Register, Virtual Address Error Registers, Context Register, and Reset.]



S4 -Cache 7/18/88 



Sun Confidential 



Page i 



Cache Interface (25 signals)

ct_a(29:16)    BD4TU      Cache Tag Address bits
ct_c(3:0)      BD4TU      Cache Tag Context bits
ct_s           BD4TU      Cache Tag Supervisor
ct_v           BD4TU      Cache Tag Valid
ct_wa          BD4TU      Cache Tag Write Allowed
ctwe_en_       BT4        Cache Tag Write Enable Enable. Goes to S4-Clock.
cdwe_en_       BT4        Cache Data Write Enable Enable. Goes to S4-Clock.
cd_oe_         BT4        Cache Data Output Enable.
car_en         BT4        Cache Address Register Clock Enable.

Miscellaneous (4 signals)

wb_oe_         BT4        Write Buffer Output Enable
wb_ce_         BT4        Write Buffer Clock Enable.
s4c_oe_        TLCHTNU    S4-Cache chip output enable.
s4c_test_      IBUFNU     S4-Cache chip Test mode.



Signals: 


144 


Device Type: 


LMAd284 


Package Type: 


PFP160 



(10:158 V0D:4 VSS:6) 
(PAOS:160 V00:7 VSS:9} 



Input/Output Buffer Definitions

DRVC8      Input clock driver
IBUFNU     Input buffer, CMOS level, inverting, internal pullup
TLCHT      Input buffer, TTL level, non-inverting
TLCHTU     Input buffer, TTL level, non-inverting, internal pullup
TLCHTNU    Input buffer, TTL level, inverting
BD#TU      Bidirectional buffer, TTL input levels, internal pullup, # indicates output drive
BD#TRU     Bidirectional buffer, TTL input levels, internal pullup, slew-rate controlled
           output, # indicates output drive
BT#        Tri-statable output buffer, CMOS, # indicates output drive current.

Functional Description

Cache Overview

The cache implemented with the aid of the S4-Cache chip is a write-through mixed
instruction/data cache with a 16-byte line size. A typical implementation is shown in the
following diagram:

[Diagram: 4096 lines, each with one tag in the Cache Tags memory and 16 bytes in the Cache Data memory.]

The cache tag and cache data memories are built using external generic static RAM chips.
Although the programmer's model of the cache data RAM is 4096 lines of 16 bytes, it is
currently implemented with eight 16K x 4 static RAMs.

The size of the cache may vary from 4096 lines deep to 16,384 lines deep. Larger
implementations of the cache will connect the unused cache tag pins to the appropriate
address bits latched in the cache address register.
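
The cache sizes quoted in the feature list follow directly from the line counts above. The sketch below works them out and also shows one plausible way to derive a line index from an address. The direct-mapped indexing (offset in the low 4 bits, index in the next log2(lines) bits) is an assumption made for illustration, as are the function names.

```python
LINE_BYTES = 16  # line size stated in the text


def cache_bytes(lines: int) -> int:
    """Total data capacity of a cache with the given number of lines."""
    return lines * LINE_BYTES


def line_index(addr: int, lines: int) -> int:
    """Assumed direct-mapped index: skip the 4 offset bits, keep log2(lines) bits."""
    return (addr >> 4) & (lines - 1)


if __name__ == "__main__":
    # 4096 lines gives 64 KBytes, 16,384 lines gives 256 KBytes.
    print(cache_bytes(4096) // 1024, cache_bytes(16384) // 1024)
    print(hex(line_index(0x12345, 4096)))
```

For the 4096-line case the offset and index together cover address bits 15:0, which is consistent with the cache tag holding address bits 29:16.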
SBus Overview

The SBus fundamental operation is shown in the diagram below. The SB_AS_ signal
indicates the validity of SB_PA(28:00), SB_RD, SB_SIZ(2:0), and the signals derived
combinatorially from these signals. On the rising clock edge at which AS_ is sampled true,
these signals will also be valid with the setup specified. The cycle will continue until an
acknowledge is received from the accessed device. Wait states will be inserted on the
SBus until the acknowledge is received.

[Timing diagram: clk, sb_as_, sb_a(29:0), sb_pa(28:13), sb_rd, sb_siz(2:0), sb_ack(32,8)_, sb_d(31:0) (read), and sb_d(31:0) (write).]

The address, read, and size signals will be held valid until the clock edge after the one
on which the acknowledge is sampled true. See the tables below for acknowledge and
size encoding.

Shared control signals SB_ACK32_, SB_ACK8_, SB_ERR_, and SB_MERR_ must follow a
special protocol, which requires that the signal is taken out of tri-state mode, driven low
for the desired number of clocks, then driven high for one clock before being tri-stated
again. See the SBus specification for further details.






Parity Errors

Parity errors are reported by the S4-Buffer chip to the S4-Cache chip via SB_MERR_. The
S4-Cache chip reports parity errors to the IU on IU cycles by asserting IU_MEXC_ as
shown in the following diagram.

[Timing diagram: clk, iu_shold_, sb_as_, sb_a(29:0), sb_ack_, sb_d(31:0), iu_d(31:0), sb_merr_, iu_mexc_, and iu_mds.]

SBus Buffered Writes

The S4-Cache chip performs buffered writes to Type 0 and Type 1 Spaces using the write
buffer in the S4-Buffer chip. The IU is held starting when the miss is detected and ending
when the MMU has been checked. This occurs invisibly to the SBus, where the buffered
write is indistinguishable from a standard write. Write data is available on the SBus on the
rising edge at which AS_ is sampled true, and on the IOD bus one clock later.

[Timing diagram: clk, iu_shold_, sb_as_, sb_a(29:0), sb_rd, sb_siz(1:0), user_, devspc_, ctl(2:0), sb_ack_, and sb_d(31:0).]

WB_CE_ Function

The WB_CE_ signal goes to the S4-Buffer chip, where it is used to generate the clock to
the write buffer as shown in the following diagram:

Dynamic Bus Sizing

Byte Packing

To execute code contained in 8-bit devices on either the SBus or the IOD bus, the
S4-Cache chip must pack the bytes up to fit the word length of the SPARC chip, as
instruction fetches assume this data width. The S4-Cache chip transforms the SPARC
data bus into a dynamically-sized bus somewhat like that of the Motorola 68020. The
number of bytes involved in the first cycle is encoded on the three SB_SIZ signals. The
current slave device responds with its port width encoded on the two SB_ACK signals. An
IU word-length access will be converted into the appropriate number of shorter accesses
if the accessed device indicates its port width is less than 32 bits.

Transfer Size Encoding

sb_siz2   sb_siz1   sb_siz0   Transfer Size

0         0         0         4 Bytes
0         0         1         1 Byte
0         1         0         2 Bytes
0         1         1         Not Used
1         0         0         16-Byte Burst
1         0         1         Not Used
1         1         0         Not Used
1         1         1         Not Used

Although the SBus specification allows 3-byte operations, none will be generated by the
S4-Cache chip because all SPARC transfers are aligned.
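
The transfer size encoding can be expressed as a small lookup. This is only a sketch: it reads the blank cells of the table as 0, and the function name and the use of None for unused encodings are choices made here for illustration.

```python
# Keys are the 3-bit value formed by (sb_siz2, sb_siz1, sb_siz0);
# values are transfer sizes in bytes, per the Transfer Size Encoding table.
TRANSFER_BYTES = {0b000: 4, 0b001: 1, 0b010: 2, 0b100: 16}


def decode_sb_siz(siz: int):
    """Return the transfer size in bytes, or None for a 'Not Used' encoding."""
    return TRANSFER_BYTES.get(siz & 0b111)


if __name__ == "__main__":
    for code in range(8):
        print(format(code, "03b"), decode_sb_siz(code))
```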
DMA Cycles

Bus Arbitration

The S4-Cache chip receives three levels of DMA bus request (SB_BR(2:0)_) and
generates three corresponding levels of bus grants (SB_BG(2:0)_). In case more than
one bus request is received simultaneously, the bus request priorities are as follows:

IU Write Hits     Highest Priority
SB_BR0_
SB_BR1_
SB_BR2_
IU Misses         Lowest Priority

If a bus request is pending at the end of a DMA cycle, the bus arbiter will use a
round-robin bus grant scheme so that all DMA masters can share equal bus bandwidth.
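
The fixed priority order for simultaneous requests can be sketched as a simple scan. The requester names below are labels invented for this example, and the round-robin behaviour at the end of a DMA cycle is deliberately not modelled, since the text does not give its details.

```python
# Priority order from the list above, highest first (labels are illustrative).
PRIORITY = ["IU_WRITE_HIT", "SB_BR0", "SB_BR1", "SB_BR2", "IU_MISS"]


def grant(requests):
    """Return the highest-priority requester among simultaneous requests, or None."""
    for name in PRIORITY:
        if name in requests:
            return name
    return None


if __name__ == "__main__":
    print(grant({"SB_BR2", "SB_BR0"}))
    print(grant({"IU_MISS", "SB_BR2"}))
```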



Rerun Cycles

The S4-Cache chip implements a rerun protocol that causes the current SBus cycle to be
aborted and restarted later. This allows resolution of deadlocks between the IU and DMA,
and allows SBus slaves to have long read latency without locking out DMA.

[Timing diagram: clk, sb_as_, sb_ack_, sb_err_, sb_br_, sb_bg_, sb_a(29:0), sb_rd, and sb_siz(2:0).]

Deadlocks can occur when a single functional module is capable of being both an SBus
slave and a DMA master. Such a module typically selects either its master or slave mode.

Cache Fills

The cache is filled under the following conditions:
Read cycle &
Device space &
Page is marked cacheable (!MMU_X) &
EN_CACHE bit in System Enable Register is set &
No protection error is detected.

A cache fill cycle consists of four 32-bit reads of main memory. As the cache controller is
capable of accepting an acknowledge on every clock, the four reads will typically be done
using a high-speed burst mode access of the main RAMs. After the first acknowledge the
bus controller will strobe the data into the IU, making the assumption that the memory
provides the requested word first rather than providing the first word in the line.

Cache Fill with Non-Continuous ACKs

[Timing diagram: clk, sb_as_, sb_a, sb_rd, sb_siz(2:0), devspc_, sb_ack32, sb_d(31:0) (16 bytes), iu_mds_, cd_we, and ct_we.]

Cache Hits

A cache hit occurs under the following conditions:
Device space &
CT_V high (cache tag is valid) &
CT_A(29:16) == latched IU_A(29:16) &
IU_A(31) == IU_A(30) == IU_A(29) &
{CT_S & Supervisor cycle} OR {!CT_S & CT_C(3:0) == CX(3:0)} &
{IU_RD OR (CT_WA & !Store double & SBus Idle)}
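
The hit conditions can be read as a single boolean predicate. The sketch below is one interpretation, not the chip's logic: it treats the supervisor/context pair as one OR group ANDed with the other terms, and the Tag class, field names, and argument names are all invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Tag:
    valid: bool          # CT_V
    addr: int            # CT_A(29:16), 14 bits
    supervisor: bool     # CT_S
    context: int         # CT_C(3:0)
    write_allowed: bool  # CT_WA


def is_hit(tag, iu_addr, context, device_space, supervisor_cycle,
           is_read, store_double=False, sbus_idle=True):
    """Evaluate the listed hit conditions for one access (assumed grouping)."""
    top = (iu_addr >> 29) & 0b111                    # IU_A(31:29)
    top_equal = top in (0b000, 0b111)                # all three bits equal
    addr_ok = tag.addr == ((iu_addr >> 16) & 0x3FFF)
    prot_ok = ((tag.supervisor and supervisor_cycle)
               or (not tag.supervisor and tag.context == context))
    access_ok = is_read or (tag.write_allowed and not store_double and sbus_idle)
    return bool(device_space and tag.valid and addr_ok and top_equal
                and prot_ok and access_ok)


if __name__ == "__main__":
    tag = Tag(valid=True, addr=0x1234, supervisor=False, context=5, write_allowed=False)
    print(is_hit(tag, (0x1234 << 16) | 0x10, 5, True, False, True))
```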

Cache Read Hit

[Timing diagram: clk, iu_a(31:0), car_clk, car_a(17:0), and iu_d(31:0).]

Cache Write Hit

[Timing diagram: clk, iu_a(31:0), a write enable, car_clk, car_a(17:0), and cd_we.]

Cache Flush Satisfying Match Criteria

[Timing diagram: clk, iu_shold, iu_rd, ctwe_en_, and ct_we_.]

Cache Flush Not Satisfying Match Criteria

[Timing diagram: iu_shold, iu_rd, ctwe_en_, and ct_we.]

Address Map

Device Space and Control Space

The SPARC address space identifiers are divided into two "spaces" according to the
following table. The signal DEVSPC_ chooses between device space and control space
address maps. Device space devices are accessed with physical addresses provided by
the MMU, while control space devices are accessed with virtual addresses on the SBus.

ASI     Function             Space

0-1     Reserved             Control
2       IU Extensions        Control
3       Segment Map          Control
4       Page Map             Control
5-7     Reserved             Control
8       User Instruction     Device
9       Supervisor Instr.    Device
A       User Data            Device
B       Supervisor Data      Device
C       Segment Flush        Control
D       Page Flush           Control
E       Context Flush        Control
F       Reserved             Control
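
The ASI table reduces to a simple rule: ASIs 8 through B select device space and every other value selects control space. A minimal sketch of that classification follows; the function name and the string return values are choices made for this example.

```python
DEVICE_ASIS = {0x8, 0x9, 0xA, 0xB}  # user/supervisor instruction and data


def asi_space(asi: int) -> str:
    """Classify a 4-bit ASI as 'Device' or 'Control' per the table above."""
    if not 0 <= asi <= 0xF:
        raise ValueError("ASI is a 4-bit value")
    return "Device" if asi in DEVICE_ASIS else "Control"


if __name__ == "__main__":
    for asi in range(16):
        print(format(asi, "X"), asi_space(asi))
```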
Registers

Shadow Context Register

The Shadow Context Register maintains a copy of the Context Register that is found in the
S4-MMU chip. It is used internally to the S4-Cache chip in the cache hit comparator, the
cache flush comparator, and the cache tag write data. It is cleared on SB_RESET_ and
written simultaneously with the Context Register in the S4-MMU chip. It can be read only
with 8-bit operations on an odd-byte location. The bits are assigned as follows:

Write:   D(31:28)   Unused      Read back as zeroes
         D(27:24)   CX(3:0)     Write Only

Read:    D(23:20)   Unused      Read back as zeroes
         D(19:16)   CX(3:0)     Read Only

System Enable Register

The System Enable Register enables various system functions and allows booting. This
register can be read and written under software control, but can only be accessed with
8-bit operations. All bits are initialized to zero by SB_RESET_. Bits are assigned as
follows:

D(31)    EN_BOOT_    Enable Boot State
D(30)    Unused
D(29)    EN_DVMA     Enable Direct Virtual Memory Access
D(28)    EN_CACHE    Enable Cache Fills & Hits
D(27)    Reserved
D(26)    SWRESET     Software Reset.
D(25)    Reserved    Reads back as zero. Write has no effect.
D(24)    Reserved    Reads back as zero. Write has no effect.

EN_BOOT_. Boot state (active low) forces all supervisor program fetches to the EPROM
device independent of the setting of the memory management. All other types of
references are unaffected and will be mapped as during normal operation of the
processor.

EN_DVMA. This bit enables all DVMA, including on-board and off-board.

EN_CACHE. When this bit is cleared, no cache fills will be performed and all IU reads will
miss.

SWRESET. A low-to-high transition on this bit will generate a SB_RESET_.

BUS ERROR REGISTERS

Four bus error registers are contained in the S4-Cache chip, located at the following
addresses:

0x6000 0000    Synchronous Error Register
0x6000 0004    Synchronous Error Virtual Address Register
0x6000 0008    Asynchronous Error Register
0x6000 000C    Asynchronous Error Virtual Address Register

Cache Tags

The cache tags are directly readable and writable in control space. Write cycles must be
performed with 32-bit accesses only. Other widths during writes will cause a Size Error
Memory Exception because the S4-Cache chip includes a byte packing register that
demultiplexes the 8-bit IOD bus up to the 32-bit cache tag bus, and it can only operate
four bytes at a time. The cache tags are not initialized in hardware, and so zeroes must
be written to all CT_V bits before the cache is enabled. Cache tag direct reads make use
of the standard byte-packing feature of the S4 chip set described earlier. The following
diagram shows the operation of the cache tag byte packing register in the S4-Cache chip
on a cache tag direct write:

[Timing diagram: clk (cycles 1 through 12), iu_shold_, sb_as_, sb_rd, devspc_, iod(7:0), sb_ack8, sb_a, and ct_we.]

The format of the cache tags is as follows:

D(31:26)                  Unused
D(25:22)   CT_C(3:0)      Cache Tag Context bits
D(21)      CT_WA          Write Allowed
D(20)      CT_S           Supervisor-only access protection bit
D(19)      CT_V           Cache Tag Valid
D(18:16)                  Unused
D(15:2)    CT_A(29:16)    Virtual address bits A(29:16)
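
The cache tag layout lends itself to a pair of pack and unpack helpers. The sketch below follows the bit positions in the format table; the function names and dictionary keys are illustrative, and unused bits are written as zero.

```python
def pack_cache_tag(ct_c: int, ct_wa: int, ct_s: int, ct_v: int, ct_a: int) -> int:
    """Build a 32-bit cache tag word from its fields (unused bits left at 0)."""
    return (((ct_c & 0xF) << 22)       # D(25:22) context bits
            | ((ct_wa & 1) << 21)      # D(21) write allowed
            | ((ct_s & 1) << 20)       # D(20) supervisor-only
            | ((ct_v & 1) << 19)       # D(19) valid
            | ((ct_a & 0x3FFF) << 2))  # D(15:2) address bits A(29:16)


def unpack_cache_tag(word: int) -> dict:
    """Split a 32-bit cache tag word into its fields."""
    return {
        "ct_c": (word >> 22) & 0xF,
        "ct_wa": (word >> 21) & 1,
        "ct_s": (word >> 20) & 1,
        "ct_v": (word >> 19) & 1,
        "ct_a": (word >> 2) & 0x3FFF,
    }


if __name__ == "__main__":
    word = pack_cache_tag(ct_c=0x5, ct_wa=1, ct_s=0, ct_v=1, ct_a=0x1234)
    print(hex(word), unpack_cache_tag(word))
```

Writing a word of all zeroes clears CT_V, which is the state the text requires before the cache is enabled.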
Timing Specifications

Output Delays

Conditions: VCC=4.75 to 5.25V, TA=0 to +70C, Output Load=15 pF

Symbol   From        To                  min     max     unit

t1       clk high    clk high            50              ns
t2       clk         hijsi                       34.4    ns
t3       clk         iu_aoe_                     17.8    ns
t4       clk         iu_mds_                     17.1    ns
t5       clk         iu_mexc_                    17.4    ns
t6       clk         iu_mhold_                   16.3    ns
t7       clk         sb_a (untransl.)            41.8    ns
t8       clk         sb_a (seg. map)             30.3    ns
t9       clk         sb_ack32_                   24.4    ns
t10      clk         sb_ack8_                    24.4    ns
t11      clk         sb_as_                      27.7    ns
t12      clk         sb_bg_                      19.8    ns
t13      clk         sb_err_                     26.1    ns
t14      clk         sb_merr_                    24.7    ns
t15      clk         sb_rd                       29.1    ns
t16      clk         sb_reset_                   23.0    ns
t17      clk         sb_siz                      36.9    ns
t18      clk         car_en                      16.7    ns
t19      clk         cd_oe_                      22.0    ns
t20      sb_rd_      cd_oe_                      16.1    ns
t21      iu_rd_      cd_oe_                      11.8    ns
t22      clk         cdwe_en_                    15.6    ns
t23      clk         ct_a                        33.7    ns
t24      clk         ct_c                        24.9    ns
t25      clk         ct_s                        24.7    ns
t26      clk         ct_v                        24.8    ns
t27      sb_rd       ct_v                        12.5    ns
t28      clk         ct_wa                       24.7    ns
t29      clk         ctl                         16.0    ns
t30      clk         ctwe_en_                    18.2    ns
t31      clk         devspc_                     16.8    ns
t32      clk         io_d                        74.5    ns
t33      sb_a        io_d                        41.0    ns
t34      clk         user_                       17.1    ns
t35      clk         wb_ce_                      16.0    ns
t36      clk         wb_oe                       16.4    ns
Change History

2/1/88

Sunray support:               Removed.
Hardware Cache Consistency:   Removed.
SB_ACK @ State 4:             Removed restriction of no ACKs before state 5.
Cache filling:                Removed restriction to Type 0 Space. Added
                              requirement of !MMU_X.
Cache Hit definition:         Added term for write hits.
Context Flush criteria:       Fixed bug in CT_S polarity.
Table of Contents:            Added.
Timing Specifications:        Added a few.

7/18/88

Cleaned up errors everywhere.
Timing:                       Added many new timing specs.
                              Used post-route timings.
Reruns:                       SB_AS_ is negated one clock later than prev. spec.
Cache Hits:                   Changed definition of cache hit on page 19.
Cache Flushing:               Removed notes about flushes before changing MMU.
                              Modified timing diagrams; IU_SHOLD_ for 2 clocks.
Bus Error Registers:          Added SER, SEVAR, AER, AEVAR definitions.
Cache Data:                   Added restriction: no write after control space read.

Errata

7/18/88

DMA Timeouts:

Timeouts that terminate DMA cycles will cause the TO_ERR bit in the Synchronous Error
Register to be set incorrectly.
S4-Buffer

Preliminary

Features

Generates and checks parity on main memory accesses

Performs buffered write cycles in conjunction with the S4-Cache chip

Multiplexes 32-bit IU data bus down to 8-bit IO data bus on write cycles

Demultiplexes and latches 8-bit IO data bus up to 32-bit IU data bus on read cycles

Contains byte-packing registers for dynamically sized reads from SchoolBus data bus

Contains Sun-4 Parity Control Register

Contains 7-bit open-drain general purpose I/O register (PIO)

Forces No Op on memory exceptions

[Block diagram. Signals shown: iu_d(31:0), wb_clk, wb_oe, par_en_, par_cs_, sb_rd, iod_en_, iu_mexc_, pio_sel_, sb_d(31:0), par(3:0), sb_err, iod(7:0), and pio(6:0).]

IBUFN      Input buffer, CMOS, inverting
IBUFNU     Input buffer, CMOS, inverting, internal pullup
TLCHT      Input buffer, TTL, non-inverting
TLCHTN     Input buffer, TTL, inverting
BD#TRU     Bidirectional buffer, TTL input levels, # indicates output drive, internal pullup
BT#        Tri-statable output buffer, CMOS, # indicates output drive current.
BD4TOD     Open drain buffer, TTL, non-inverting.

Port Location

The location of 8, 16 and 32-bit ports on the 32-bit SchoolBus data bus is defined as
follows:

[Table: rows for the 8-bit port, 16-bit port, and 32-bit port against the byte lanes sb_d(31:24), sb_d(23:16), sb_d(15:8), and sb_d(7:0).]

SB_D Read Data Latching

[Diagram: latching of SB_D into IU_D on an 8-bit port acknowledge (byte lanes iu_d(31:24), iu_d(23:16), iu_d(15:8), iu_d(7:0) and sb_d(31:24), sb_d(23:16), sb_d(15:8), sb_d(7:0)) and on a 16-bit port acknowledge (halfword lanes iu_d(31:16), iu_d(15:0) and sb_d(31:16), sb_d(15:0)).]

Parity Checking

Parity is checked on read cycles during which PAR_EN_ is active and the Parity Check bit
is set in the Parity Control Register (see below for a description of the Parity Control
Register). Parity errors are reported by asserting SB_ERR_ for one clock period, and
setting the bits in the parity control register corresponding to the bytes in which parity
errors were detected. SB_ERR_ will cause the S4-Cache chip to assert IU_MEXC_,
causing the IU to take a memory exception trap. Parity checking is even, meaning a byte
of ones requires a zero parity bit, so that a data and parity bus floating high will cause a
parity error.
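
The even-parity rule is easy to confirm in a few lines. The sketch below computes the parity bit that makes the total number of ones (data plus parity) even, and shows why a bus floating high is detected as an error. The function names are illustrative.

```python
def even_parity_bit(byte: int) -> int:
    """Parity bit that makes the count of ones in data + parity even."""
    return bin(byte & 0xFF).count("1") & 1


def parity_error(byte: int, parity: int) -> bool:
    """True when the stored parity bit does not match the data byte."""
    return even_parity_bit(byte) != (parity & 1)


if __name__ == "__main__":
    print(even_parity_bit(0xFF))   # a byte of ones needs a zero parity bit
    print(parity_error(0xFF, 1))   # data and parity all floating high
```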

Parity errors are reported on HI cydes by a one-dock low pulse on the SB_ERR_ signal, 
two clocks after S8_ACK32.. as shown In the foflowtng diagram: 



dk 

kj shoid 



I:^icj^S^:Hijc::j| 



sb_ack32, 
sb err 



j^S" 



>- n. 



^iSig^^jaj^l 



Parity errors are reported on OVMA cydes by a one-dock low pulse on the S8_ERR_ 
signal, one dock after S8_ACI02., as sho%vn In the following diagram. Note that on OVMA 
cycles, this SB_ERR_ signal could occur after S8_6Q_ has been asserted to another 
device, so that device must take care not to react. 



sb_bg_ 
sb_ack32, 
sb err 



f!l^!^T|'™'''"'™'^g^j!!?gpi^;jj$p;^^ 



I ;;;":JfeSijJiM|^ I •■^'- •■'•1 



mmw 
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S4-Buffer 



The system bus controfier knplements dynamic bus sizing for CPU eyeies. This function Is 
perf omned through the joint efforts of the S4'-Cadie and the S4-8uff er.Taidng the desired 
timnsf er vvidth and the port size ^o account, the bus ^^ntroSer padcs data from narrower 
ports up to the desired width by performing several bus cycles. This byte packing is 
performed cniV for CPU cydes. not for DMA eyeies. Tne cydes appear as separate cycles 
indistinguishable from cycles that don't involve byte packing. 



Transfer Size    Port Size    Controller Response
1-Byte           Any          Single BYTE cycle
2-Byte           8-bit        Two BYTE cycles
                 16-bit       One HALF cycle
                 32-bit       One HALF cycle
4-Byte           8-bit        Four BYTE cycles
                 16-bit       Two HALF cycles
                 32-bit       One WORD cycle
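
The table can be summarized by one rule: each bus cycle is as wide as the narrower of the transfer and the port, and enough cycles are run to cover the transfer. A minimal sketch in C, assuming hypothetical names (`plan_cpu_transfer`, `struct bus_plan`) that do not appear in the specification:

```c
/* Cycle widths in bytes; the names are illustrative only. */
enum cycle_kind { CYCLE_BYTE = 1, CYCLE_HALF = 2, CYCLE_WORD = 4 };

struct bus_plan {
    enum cycle_kind kind;  /* width of each bus cycle */
    unsigned count;        /* number of bus cycles performed */
};

/* transfer_bytes must be 1, 2 or 4; port_bits must be 8, 16 or 32. */
static struct bus_plan plan_cpu_transfer(unsigned transfer_bytes,
                                         unsigned port_bits)
{
    unsigned port_bytes = port_bits / 8;
    unsigned width = transfer_bytes < port_bytes ? transfer_bytes : port_bytes;
    struct bus_plan plan;

    plan.kind = (enum cycle_kind)width;
    plan.count = transfer_bytes / width;
    return plan;
}
```

For example, a 4-byte transfer to a 16-bit port yields two HALF cycles, matching the table row. The sketch covers CPU cycles only, since the text states that DMA cycles are not packed.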
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S4-Buffer 

Preliminary



Parity Control Register

The Parity Control Register provides facilities for enabling and reporting parity errors and
for testing the parity generation and checking logic. It is a 32-bit read/write register,
cleared on SB_RESET_, accessible 8 bits at a time over the IOD bus. It has the following
fields:

D(31:8)   Reserved          Read as zero
D(7)      Parity Error      Set on any parity error
D(6)      Second Error      Set if D(7) is set and new error occurs
D(5)      Parity Test       Set to write parity with the inverse polarity
                            to test the operation of the parity error
                            circuitry. With Parity Test off, correct
                            parity is generated on all memory write
                            cycles.
D(4)      Parity Check      Enables parity checking
D(3)      Parity Error 24   Records parity error on data bits 31:24
D(2)      Parity Error 16   Records parity error on data bits 23:16
D(1)      Parity Error 08   Records parity error on data bits 15:8
D(0)      Parity Error 00   Records parity error on data bits 7:0

Note that the Error Bits D(7, 6, 3:0) are not writable. They are set by errors and reset
automatically when read back.

Parity Control Register Read

[Timing diagram: clk, iu_shold, sb_rd, par_cs_, iod_en_, sb_ack8_ *, iod(7:0)]

* sb_ack8_ is generated by the MMU on Parity Control Register accesses.
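
Because the error bits reset automatically when read back, software would normally read the register once and decode the saved value. The following C sketch shows one way to do that for the low byte, D(7:0); the macro and structure names are assumptions for illustration, not names from the specification.

```c
#include <stdint.h>

/* Bit positions follow the field list above; names are illustrative. */
#define PCR_PARITY_ERROR  (1u << 7)   /* D(7): set on any parity error */
#define PCR_SECOND_ERROR  (1u << 6)   /* D(6): error while D(7) already set */
#define PCR_PARITY_TEST   (1u << 5)   /* D(5): write inverse parity */
#define PCR_PARITY_CHECK  (1u << 4)   /* D(4): enable parity checking */
#define PCR_ERROR_LANES   0x0Fu       /* D(3:0): one bit per byte lane */

struct pcr_status {
    int error;         /* nonzero if D(7) was set */
    int second_error;  /* nonzero if D(6) was set */
    unsigned lanes;    /* D(3:0); bit 3 is data bits 31:24, bit 0 is 7:0 */
};

/* Decode a value that has already been read from the register. */
static struct pcr_status pcr_decode(uint8_t value)
{
    struct pcr_status s;

    s.error = (value & PCR_PARITY_ERROR) != 0;
    s.second_error = (value & PCR_SECOND_ERROR) != 0;
    s.lanes = value & PCR_ERROR_LANES;
    return s;
}
```

Decoding a saved copy avoids a second read, which would return the already-cleared error bits.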
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S4-Buffer 

Preliminary



Timing Specifications 

Conditions: VCC=4.75 to 5.25V, TA=0 to +70 C, Output Load=100 pF

Symbol   From       To         min     max     unit
         clk high   clk high   40              ns
         clk        iu_d       10.5    23.5    ns
         clk        sb_d       18      27      ns
         clk        par        23      34.5    ns
         clk        pio        12      21.5    ns
         clk        iod        8       31.5    ns
         clk        sb_merr    16      23.5    ns
         iu_mexc    iu_d       6.5     12      ns
         wb_oe      sb_d       6.5     21      ns

Setup time for all signals is 15 ns. Hold time for all signals is 3 ns.
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S4-MI\/IU 

Preliminary



Features 



• Provides decodes and timing strobes for all Sun-4 Type 1 devices

• Replaces all MMU read/write buffers

• Automatically updates MMU statistic bits during bus cycles

• Prioritizes 15 levels of interrupts

• Sun-4 Interrupt register provides software interrupts, interrupt enable

• 4-bit context register provides switchable MMU contexts

• Two counters generate high-resolution periodic interrupts



[Block diagram: S4-MMU with signals sb_a(29:18), irq, iod(7:0), pmeg(7:0), pa(27:12),
mmu_(vwsxam), iu_irl(3:0), sb_a(17:12), and the device decodes]
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S4>I\/IMU 

Preliminary 



Functional Description 

Device Space and Control Space 

The SPARC address space identifiers are divided into two "spaces" according to the
following table:

ASI    Function             Space
0-1    Reserved
2      IU Extensions        Control
3      Segment Map          Control
4      Page Map             Control
5-7    Reserved
8      User Instruction     Device
9      Supervisor Instr.    Device
A      User Data            Device
B      Supervisor Data      Device
C-F    Reserved

The signal DEVSPC_ chooses between device space and control space address maps.
Device space devices are accessed with physical addresses provided by the MMU, while
control space devices are accessed with virtual addresses provided by the SPARC
processor.
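
The ASI table translates directly into a small classifier. The C sketch below is illustrative only; the enum and function names are assumptions, and values above F are treated as reserved because the table does not cover them.

```c
/* Which address map an ASI selects, per the table above. */
enum asi_space { ASI_SPACE_RESERVED, ASI_SPACE_CONTROL, ASI_SPACE_DEVICE };

static enum asi_space asi_to_space(unsigned asi)
{
    if (asi >= 0x2 && asi <= 0x4)   /* IU extensions, segment map, page map */
        return ASI_SPACE_CONTROL;
    if (asi >= 0x8 && asi <= 0xB)   /* user/supervisor instruction and data */
        return ASI_SPACE_DEVICE;
    return ASI_SPACE_RESERVED;      /* 0-1, 5-7, C-F, and anything larger */
}
```

For example, `asi_to_space(0x3)` reports control space (Segment Map), while `asi_to_space(0xA)` reports device space (User Data).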

Control Space 

CTL(2:0) Encoding (Control Space Address Map)

ctl(2:0)    Device
0           Device on S4-Cache Chip
1           Reserved for VME IACK
2           Context Register *
3           Diagnostic Register (unused)
4           Serial Controller Chip (MMU Bypass)
5           Segment Map
6           Page Map
7           EPROM (Boot Cycle, Supv. Instr. Fetch)

* - Context reg access requires A0 low.

In Device Space (DEVSPC_ low) the ctl(0) input is used as an invalidation input for any
cycle from the cache chip. It is used when the cache chip determines an illegal virtual
address (a(31:28) not all ones or all zeroes) which the MMU cannot detect, to inhibit
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through the PMEG(7:0) bus. The following diagram shows a Segment Map write cycle:




[Timing diagram: pm_wr(3:0)_, sm_wr_]



Note that only one of these signals is asserted at a time. 



Page Map 

The page map is the second level of the two-level MMU, and contains 8K or 16K page map
entries, each mapping a 4 Kbyte page. It is indexed by the 7/8-bit PMEG provided by the
segment map concatenated with virtual address bits SB_A(17:12). The page map bit
definition is as follows:

Bit      Type            Description
31       V               valid bit, implies read access
30       W               write allowed protection bit
29       S               supervisor only protection bit
28       X               don't cache bit
27:26    MMU_TYP(1:0)    0 => main memory
                         1 => input/output space
                         2,3 => reserved for VMEbus
25       A               accessed (statistic bit)
24       M               modified (statistic bit)
23:16    none            reserved
15:0     page            physical page number
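
As a worked illustration of the bit definitions above, the C sketch below extracts the fields of a page map entry and forms the page map index from a PMEG and a virtual address. All names are assumptions for the example. The physical address helper assumes the 16-bit page number is simply concatenated with the 12-bit offset of a 4 Kbyte page.

```c
#include <stdint.h>

/* Page map entry bits, following the table above (names illustrative). */
#define PME_VALID     (1u << 31)
#define PME_WRITE     (1u << 30)
#define PME_SUPV      (1u << 29)
#define PME_NOCACHE   (1u << 28)
#define PME_ACCESSED  (1u << 25)
#define PME_MODIFIED  (1u << 24)

/* MMU_TYP(1:0): 0 = main memory, 1 = input/output space. */
static unsigned pme_type(uint32_t pme) { return (pme >> 26) & 0x3u; }

/* Physical page number, bits 15:0. */
static uint32_t pme_page(uint32_t pme) { return pme & 0xFFFFu; }

/* Index = PMEG concatenated with virtual address bits 17:12. */
static unsigned page_map_index(unsigned pmeg, uint32_t va)
{
    return (pmeg << 6) | ((va >> 12) & 0x3Fu);
}

/* Assumed: page number followed by the 12-bit offset within the page. */
static uint32_t pme_phys_addr(uint32_t pme, uint32_t va)
{
    return (pme_page(pme) << 12) | (va & 0xFFFu);
}
```

A 7-bit PMEG gives indices 0 through 8191 (8K entries) and an 8-bit PMEG gives 0 through 16383 (16K entries), which agrees with the entry counts quoted in the text.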
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S4-MMU 

Preliminary 



Device Space Address Map

mmu_typ(1:0)    pa[31:20] *    device
0               0XX            RAMSEL (main RAM)
1               F0X            Keyboard/Mouse
                F1X            Serial Controller Chip
                F2X            TOD Clk, NVRAM
                F3X            Counter Registers
                F4X            Parity Ctrl/Aux
                F5X            Interrupt Register
                F6X            EPROM
                F72            Floppy Controller
                F73            Audio DAC
                F74            Aux Out Register
                F8X-F9X        SchoolBus Onboard
                FAX-FBX        Video Onboard
                FCX-FDX        SchoolBus Slot 1
                FEX-FFX        SchoolBus Slot 2
2               all            Unused
3               all            Unused

* PA[31:28] is not actually decoded, but assumed to be 1's on Type 1 access and 0's on
Type 0.



Main RAM — Statistics Update Cycles 

The operating system requires certain information about the read/write history of each
page mapped into main memory. The S4-MMU chip maintains this information in the
MMU_A and MMU_M bits, automatically updating them on any reads or writes of main
memory. A statistics update cycle is shown below:




[Timing diagram: devspc_, mmu_a, mmu_m (high for write cycles), pm_wr(2)_]



Because the PM_WR_ signals will be asserted in Cycle 3 and negated in Cycle 5,
addresses must remain stable to the MMU RAMs throughout Cycle 5; the earliest they may
change is Cycle 6. Statistics bits are tri-stated in Cycle 6. No data collision occurs because
the addresses do not change; we are reading the data we wrote.
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S4-MMU 

Preliminary



Interrupt Register

The Interrupt Register provides for software generation of interrupts and allows the CPU to
disable all interrupts or only certain ones. It is cleared on SB_RESET_, and has the
following fields:

Bit    Field
31     Enable Level 14 Interrupts
30     Reserved
29     Enable Level 10 Interrupts
28     Enable Level 8 Interrupts
27     Software Interrupt Level 6
26     Software Interrupt Level 4
25     Software Interrupt Level 1
24     Enable Interrupts (clears Level 15 when cleared)

All IRQ(13:1)_ signals may be asynchronous to the system clock.

Software interrupts may be generated on levels 6, 4, and 1 by writing a 1 into bits 27, 26,
or 25 when interrupts are enabled (bit 24 high).

Level 15 interrupt requests are captured on a clock edge and held asserted to the CPU
until the Enable Interrupts bit of the Interrupt Register is cleared.

Note that writing a zero to the Enable bits in the Interrupt Register only masks out that
level's interrupt; it does not clear the source (with the exception of Level 15 requests). This
is different from the Sun-4 Architecture, in that the periodic interrupts at Levels 10 and 14
must be cleared by accessing their respective Limit registers.
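
The software-interrupt rule can be expressed with a few bit masks. The C sketch below is an illustration only: it uses the 31..24 bit numbering shown for the register, and the macro and function names are assumptions rather than names from the specification.

```c
#include <stdint.h>

/* Bit numbers as used in this section (names illustrative). */
#define IR_ENABLE_L14  (1u << 31)
#define IR_ENABLE_L10  (1u << 29)
#define IR_ENABLE_L8   (1u << 28)
#define IR_SOFT_L6     (1u << 27)
#define IR_SOFT_L4     (1u << 26)
#define IR_SOFT_L1     (1u << 25)
#define IR_ENABLE_ALL  (1u << 24)

/* Returns the software-interrupt bit for a level, or 0 if the level
 * has no software interrupt (only levels 6, 4 and 1 do). */
static uint32_t ir_soft_bit(unsigned level)
{
    switch (level) {
    case 6:  return IR_SOFT_L6;
    case 4:  return IR_SOFT_L4;
    case 1:  return IR_SOFT_L1;
    default: return 0;
    }
}

/* A software interrupt is generated only while interrupts are enabled. */
static int ir_soft_active(uint32_t reg, unsigned level)
{
    uint32_t bit = ir_soft_bit(level);
    return bit != 0 && (reg & IR_ENABLE_ALL) != 0 && (reg & bit) != 0;
}
```

For instance, a register value of `IR_ENABLE_ALL | IR_SOFT_L4` requests a level 4 software interrupt, while the same soft bit with bit 24 low requests nothing.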

Level-Sensitive Interrupts: 



[Timing diagram: irq(level)_, iu_irl(3:0)_ (encoded priority)]



Interrupting Devices (assumed system configuration):

Int Level    Device
15           Buffered Write Timeout Error
14           Clock Interrupt 14 from Counter 1
13           Bus IRQ13
12           Keyboard/Mouse, Serial Ports
11           Bus IRQ11, Floppy
10           Clock Interrupt 10 from Counter 0
9            Bus IRQ9
8            Video
7            Bus IRQ7
6            SWIRQ6, Ethernet
5            Bus IRQ5
4            SWIRQ4, SCSI DMA
3            Bus IRQ3
2            Unused
1            SWIRQ1, Bus IRQ1



EPROM 
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Both Counters are separately writeable for testing purposes. They should not be written in 
normal operation. Because of the 8-bit interface, unpredictable carries could occur.

Auxiliary Output Registers 

An additional read/write strobe has been added for a set of Auxiliary Output Registers 
located in Type 1 Device space beginning at F7400000. 

DAC Write and Transfer strobes 

The DAC_WR_ and DAC_XFER_ signals are somewhat overloaded. In the power-up
mode, they are used to access an external double-buffered DAC. The DAC_WR_ signal is
asserted when the CPU attempts to write to the audio DAC address range. It is a slow
device, inserting 7 wait states, like the SCC's. The DAC_XFER_ signal is asserted when
counter 1 hits its limit register value, transferring the holding register data into the DAC
internal register. It is asserted for 6 clocks or until the interrupt source (Limit 1) is
removed, whichever comes first.

When the Internal DAC is enabled (see below) the DAC_WR_ pin becomes the DAC2
output. The DAC_XFER_ pin becomes the PWM output, varying in duty cycle between
0 and 511 CLKs out of 512.

In addition, the DAC_WR_ signal is asserted for both reads and writes at location 
0xF7FXXXXX. This is used as an S4-VME chip select signal. 

Internal PWM DAC 

Two 8-bit Pulse-Width Modulation DACs are implemented, operating off of the 40/50 nSec
CLK input. When enabled, these DAC outputs replace the DAC_WR_ and DAC_XFER_
output pins. The internal DAC responds to the same address space as the external DAC,
only faster: Type 1 Device Space ($F73CXXXX).

The output of the PWM DAC is a square wave with a duty cycle between 0 and just under
100%. When the DAC data register is programmed with 0's, the output is never high.
When it is programmed with $0080 (least-significant bit of 9-bit DAC set), the output is
high for one clock every 512. When it is programmed with $FF80 the output is high 511 out
of every 512 clocks.
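
The three data points in the text ($0000, $0080, $FF80) are consistent with the upper 9 bits of a 16-bit data register giving the number of high clocks in each 512-clock period. A minimal C sketch under that assumption (the function name is hypothetical):

```c
#include <stdint.h>

/* Number of clocks, out of every 512, for which the PWM output is high.
 * Assumes the 9-bit DAC value occupies bits 15:7 of the data register. */
static unsigned pwm_high_clocks(uint16_t data)
{
    return (data >> 7) & 0x1FFu;
}
```

Under this reading the duty cycle is `pwm_high_clocks(data) / 512`, which never reaches 100% because the largest count is 511.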



[PWM DAC Block Diagram: Data, DAC Write, DAC XFER, 40/50 nsec Clock, Hold Reg, DAC Reg,
Ctr, Compare (A<B), DAC Output, DAC Sync]
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S4-MMU 

Preliminary 



Functional Timing Diagrams 



Keyboard/Mouse or SCC Read

[Timing diagram: clk, as_, sb_rd, devspc_, kbm_rd_, scc_rd_, sb_ack8_, iod_en_]



Keyboard/Mouse or SCC Write

[Timing diagram: clk, as_, sb_rd, devspc_, iod(7:0), kbm_wr_, scc_wr_, sb_ack8_]
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itm^m^v 






S4-MMU 

Preliminary 



SBus, RAM, or IOSEL Read

[Timing diagram: clk, as_, sb_rd, devspc_, ramsel_, sb_sel_, iosel_, sb_ack(32,8)_, iod_en_]

iosel_, ramsel_ and sb_sel_ are NOT referenced to any clk edge.

Note: SB_ACK8_ is generated by these slave devices, not by the S4-MMU chip.



SBus, RAM, or IOSEL Write

[Timing diagram: sb_rd, iod(7:0), sb_sel_, ramsel_, iosel_, sb_ack(32,8)_]

Note that sb_ack[8,32]_ or sb_err_ sampled by the MMU terminates the select.
Note: SB_ACK8_ is generated by these slave devices, not by the S4-MMU chip.
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Timing Specifications and Diagrams 



Tn    Description                             min    max
t1    iu_clk cycle time                       40
t2    Setup time, as_ signals before clk      3
t3    Hold time, as_ signals after clk        15
t4    Hold time, Class-1 signals after clk    0
t5    Setup time, Class-1 signals before clk  15
t6    Delay Class-2 to x-sel_ negated                22
t7    Delay Class-2 to x-sel_ asserted               23
t8    Synchronous output delay                       22



[Timing diagram: iu_clk, as_, Class-1, Sync-outputs, Class-2, x-sel_]





Class-1 signals are: io_a[3:0], ctl[2:0], devspc_, sb_rd, user_, pmeg[7:0] (in),
pa[27:12] (in), mmu_[vwsxam] (in), mmu_typ[1:0] (in). These signals are used
synchronously in this case.

Class-2 signals are: pa[27:12] (in), mmu_[vwsxam] (in), mmu_typ[1:0] (in), ctl[2:0],
devspc_, sb_rd, user_. These signals are used asynchronously in this case, affecting
outputs sb_sel[3:0]_ and ramsel_.
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S4-MMU 

Preliminary



t1    clk cycle            40    -    ns
t2    as_ setup to clk     15
t3    as_ hold from clk    2



Notes: 

1. This timing specification does not meet the ideal requirements for 25 MHz system
operation.

- NOTE: IOD_EN_ is asserted only on READS. It is assumed that all write cycles drive the
iod bus.



Change History 

12/15 tw Config register is gone.
         Counter/Timer is 30 bits. Added Interrupt_Occurred bit.
         Diag register and bit added.
12/17 tw Added sb_ack32_, made sb_ack8 and sb_err B04's.
         Statistics updates tristate in Cycle 6.
12/16 tw Modified Counter/Timer to freerun on reset.
         Moved DACWR, Ctr, Limit, Floppy to EOl-4.
12/18 tw Two Counter/Limit register sets, dedicated at Int levels 10 and 14.
         Deleted IRQ inputs 10 and 14.
         Deleted PARA output, multiplexed with od_ input in test mode.
         Diag is now a BT8.
         Added one more SB_SEL_ signal, deleted vctl_cs and vramsel.
         Deleted DMA Starvation timeout, deleted SB_BG pins.
         Added A2 and 3, gathered Counters and Limit registers in one page.
12/21 tw DAC_WR Gone. It's now in the Video chip.
         Counter starts at 1.
12/22 tw ramsel, vramsel (sb_sel1) are now combinatorial.
         All inputs are ttl levels.
12/29 tw PAR_EN_ signal removed. S4-Buffer will use RAMSEL_ instead.
1/5/88 tw RAMSEL is now all of Type1 Device Space.
1/14/88 tw DIAG changed to AUX_WR_. IOSEL changed slightly.
1/21 tw Added Limit bit to Counter. Moved Counter to EF. Moved SB Slots to
         Type Space.
1/27 tw TOD is now just a CS_. Added DAC_XFER to allow for double-buffered DAC.
1/29 tw SB_SELn_ are now all asynchronous.
2/8 tw Removed SB_ACK32_ and SB_ERR_.
2/23 tw Added DMA_ pin and description.
2/29 tw Fixed mmu ram write pulse in Pg 4 diagram.
3/8 tw Added internal pwm dac. IODEN_ documented.
3/15 tw 4k pages. Changed memory map.
4/7 tw iosel_ asynchronous. VME select address removed due to lack of use.
4/18 tw Level 15 interrupts captured and held. Cleared by turning off all interrupts.
4/19 tw Changed Device Address Map to remove reference to onboard video.
4/26 tw Video is back. Ignore previous change.
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Features 

• Single chip interface between Ethernet (LANCE), SCSI (ESP) and Sbus

• Handles 32 bit packing and unpacking

• Generic support for 8 bit peripherals

• Supports externally programmable Sbus ID

• Low cost 120PFP package



[Pin diagram: S4-DMA with its Sbus, Ethernet, and DMA interface signals, plus io_a(5:2),
io_a(1), id_cs_, od_/par_tst_, fast/slow, and reset]
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1.0 Pin Description 



Name           Type      Description

Bus Interface (51)

sb_d(31:0)     B04TU     Sbus Data Bus
sb_br_         BT4       Sbus Bus Request
sb_bg_         TLCHTU    Sbus Bus Grant
sb_ack32_      B04TNU    Sbus 32-bit Acknowledge
sb_ack8_       B04TNU    Sbus 8-bit Acknowledge
sb_reset_      TLCHTU    Sbus Reset
sb_err_        B04TNU    Sbus Error
sb_merr_       TLCHTU    Sbus Memory Error
sb_clk         DRVC16    Sbus Clock input
sb_rd          B04TU     Sbus Read/Write_
sb_sel_        TLCHTU    Sbus Select
sb_irq_        B04TOD    Interrupt Request (open-drain)
sb_siz(2:0)    B04TU     Sbus Transfer Size
sb_as_         TLCHTU    Address strobe (addr is valid)
pa(X:Y)        TLCHTU    Physical Address lines (for slave decodes)
pa(3:1)        TLCHTU    Physical Address bits

Ethernet Interface (32)

e_as_          TLCHTD    Ethernet Address Strobe
e_hold_        TLCHTU    Ethernet Hold
e_hlda_        BT4       Ethernet Hold Acknowledge
e_read         B04TU     Ethernet Read
e_das_         B04TU     Ethernet Data Strobe
e_rdy_         B04TU     Ethernet Ready
e_cs_          BT4       Ethernet Chip Select
e_byte         TLCHTU    Ethernet Byte marker
e_a23:16       TLCHTD    Ethernet High Order Address
e_ad15:0       B04TU     Ethernet Address / Data Bus

DMA Interface (16)

d_d7:0         B04TD     DMA Data Bus
d_req          TLCHT     DMA Request
d_ack          BT4       DMA Acknowledge
d_rd_          BT4       DMA Read Strobe (reg read or dma to memory)
d_wr_          BT4       DMA Write Strobe (reg write or dma from memory)
d_cs_          BT4       DMA Chip Select for slave register access
d_irq_         TLCHTU    DMA Interrupt Request
d_reset        BT4       DMA Reset
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1.1 BLOCK DIAGRAM 



The S4-DMA gate array provides three independent functions:

1. Sbus Identification
2. Ethernet Interface to the Sbus
3. Sbus DMA Channel



















[Block diagram: LANCE interface, address decode of SB PA(...), SBUS arbitration (BR_, BG_),
8-bit DMA interface with CSR, ADDR CNT, and BYTE CNT, data in and data out paths, and
output MUX]
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During Slave Cycles the S4-DMA takes control of the sb_err_, sb_ack8_, and sb_ack32_ signals. The
combination of responses is as follows:



sb_ack8_    sb_ack32_    sb_err_    Definition
1           1            1          Insert wait states
1           1            0          Error
1           0            1          32-bit port ack
1           0            0          Error
0           1            0          Rerun
0           0            1          16-bit port ack
0           1            1          8-bit port ack
0           0            0          Reserved

This table represents all possible SBus responses. The S4-DMA gate-array can, however, only
generate those responses marked with a **.



3.0 Sbus Identification 



This is a mechanism which allows software to uniquely identify each Sbus device, since each device
can have a unique ID.

Unique ID's will be provided by Sun. The onboard id is hardwired to the 32-bit value fe810101. This
value will be returned when the ID field is accessed by the IU (and the id_cs_ pin is tied low). If the
id_cs_ pin is pulled high then access to the ID field will cause an external access using the id_cs_ pin
as an external chip select. Refer to the S4 Software Architecture Specification for further details.
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sb_sel_ & sb_as_    pa(X:Y)    io_a(1)    Register accessed              Size      Type
0                   11         0          Register Data Port (RDP)       16-bit    R/W
0                   11         1          Register Address Port (RAP)    16-bit    R/W



Once the S4-DMA has granted access of its local bus to the LANCE, the CPU cannot access the
LANCE until the pending cycles are completed. In order to remove the potential deadlock condition
which results, the S4-DMA will cause a rerun according to the table on page 7.



4.1 Ethernet Interface Block Diagram 



[Block diagram: Control, D(31:0), VA(31:00), VA(31:24) register, VA(23:00), Ethernet Address
Latch, Ethernet Data Pack/Unpack, Ethernet Interface State Machine, and Sync, with signals
e_as_, e_a23:16, e_ad15:0, e_hold_, e_byte, e_hlda_, e_cs_, e_read, e_rdy_, e_das_]
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5.1 DMA Interface Block Diagram 



[Block diagram: S4-DMA with VA(31:00), D(31:00), Control, DMA ADDRESS (31:00), BYTE COUNT
(31:00), DMA STATUS/CONTROL REG, DATA PACK/UNPACK REG, DMA INTERFACE STATE MACHINE, and
SYNC, with signals d_irq_, d_reset, d_cs_, d_ack, d_rd_, d_wr_, d_req, d_d7:0]
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5.3 DMA Control/Status Register Assignments (DMA_CSR)

Bit      Mnemonic     Description                                        Type
0        INT_PEND     Set when d_irq_ or TC asserted. Reset when not.    R
1        ERR_PEND     Set when mem. exc occurred, DMA stopped.           R
                      Reset on FLUSH command.
3:2      PACK_CNT     Number of bytes in Pack Register                   R
4        INT_EN       When set enables d_irq_ state onto sb_irq_.        R/W
5        FLUSH *      When set causes PACK_CNT, ERR_PEND and TC          W
                      to be reset. Reads as 0.
6        DRAIN *      When set causes remaining pack register bits       R/W
                      to be drained to memory. PACK_CNT = 00.
                      Clears itself.
7        RESET        When set acts as a hardware reset.                 R/W
8        WRITE        DMA direction; 1 = to memory, 0 = from memory      R/W
9        EN_DMA       When set allows the device to respond to           R/W
                      DMA device requests
10       REQ_PEND     When set the DMA i/f is active.                    R
                      DO NOT assert RESET or FLUSH
12:11    BYTE_ADDR    Next byte number to be accessed.                   R
13       EN_CNT       When set enables the internal byte counter         R/W
                      (not used with the ESP SCSI chip)
14       TC **        Terminal Count. Byte counter has expired           R
15       ILACC ***    When set this bit instructs the ethernet interface R/W
                      to act slightly differently (see note below)
27:16                 Reserved (all unused bits to read as 0)            R
31:28    DEV_ID       Device ID (for this implementation = 1000)         R

* RESET

POWER_ON RESET or RESET from bit 7 will leave the device in the following state:

ERR_PEND = PACK_CNT = INT_EN = FLUSH = DRAIN = WRITE = EN_DMA = REQ_PEND = EN_CNT = TC =
0, RESET = 1, and BYTE_ADDR = 00. All interface state-machines will revert to their idle states
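
A driver would typically wrap these bit assignments in masks and small accessors. The C sketch below is one hedged way to do so; the macro and function names are assumptions for illustration, and only bits whose positions are unambiguous in the table are defined.

```c
#include <stdint.h>

/* DMA_CSR bit masks, following the table above (names illustrative). */
#define DMA_INT_PEND  (1u << 0)
#define DMA_ERR_PEND  (1u << 1)
#define DMA_INT_EN    (1u << 4)
#define DMA_FLUSH     (1u << 5)
#define DMA_DRAIN     (1u << 6)
#define DMA_RESET     (1u << 7)
#define DMA_WRITE     (1u << 8)   /* 1 = transfer to memory */
#define DMA_EN_DMA    (1u << 9)
#define DMA_REQ_PEND  (1u << 10)
#define DMA_EN_CNT    (1u << 13)
#define DMA_TC        (1u << 14)

/* Multi-bit fields. */
static unsigned dma_pack_cnt(uint32_t csr)  { return (csr >> 2) & 0x3u; }
static unsigned dma_byte_addr(uint32_t csr) { return (csr >> 11) & 0x3u; }
static unsigned dma_dev_id(uint32_t csr)    { return (csr >> 28) & 0xFu; }

/* The table warns against asserting RESET or FLUSH while REQ_PEND is set. */
static int dma_may_reset_or_flush(uint32_t csr)
{
    return (csr & DMA_REQ_PEND) == 0;
}
```

A driver built on this sketch would poll the register until `dma_may_reset_or_flush()` returns nonzero before writing `DMA_RESET` or `DMA_FLUSH`.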
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5.6 Programming Notes 

The address counter always points at the next memory location to be accessed. When the direction of
transfer is to memory the counter is incremented by the size of the write (1 or 4) upon completion of
the transfer. When the direction of transfer is from memory the address is always incremented by 4,
but the lower 2 bits are driven low such that all reads are word sized and word aligned. Byte alignment
is done inside the gate-array.

There is a 2-bit byte counter BYTE_ADDR that always points to the next byte location that the DMA
device will access. This counter is incremented by 1 each time a byte is transferred between the
external device and the gate-array. Note the byte counter is controlled by the DMA interface whereas
the address counter is controlled by the memory interface, hence the two may disagree. This byte
counter is loaded at the same time the address is loaded and receives the two least significant bits of
the address.

Another 2-bit counter PACK_CNT keeps track of how many bytes are stored in the internal PACK
register. Note this pack count is only valid for transfers to memory. Whenever the PACK_CNT = 3 and
another byte is accepted, a word write is scheduled with the memory interface. If a DMA transfer
completes leaving a non-word fragment in the PACK register, then this counter is used by the
hardware to determine how many bytes to write to memory when the DRAIN command is received. Both
PACK_CNT and BYTE_ADDR can be read in the Control and Status Register (DMA_CSR).
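
The PACK_CNT behaviour for transfers to memory can be modelled in a few lines of C. This is a software sketch of the rule described above, not the gate-array logic: the structure and function names are hypothetical, and placing the first byte received in the most significant position of the word is an assumption.

```c
#include <stdint.h>

/* Software model of the PACK register for transfers to memory. */
struct pack_model {
    uint32_t pack_reg;       /* bytes accumulated so far */
    unsigned pack_cnt;       /* number of bytes held, 0 to 3 */
    unsigned words_written;  /* word writes scheduled so far */
    uint32_t last_word;      /* most recent word handed to memory */
};

/* Accept one byte from the DMA device. When PACK_CNT is 3 and another
 * byte arrives, a full word is scheduled and the count returns to 0. */
static void pack_accept_byte(struct pack_model *m, uint8_t byte)
{
    /* Assumed ordering: first byte ends up most significant. */
    m->pack_reg = (m->pack_reg << 8) | byte;
    if (m->pack_cnt == 3) {
        m->last_word = m->pack_reg;
        m->words_written++;
        m->pack_reg = 0;
        m->pack_cnt = 0;
    } else {
        m->pack_cnt++;
    }
}
```

Feeding five bytes into the model schedules one word write and leaves `pack_cnt` at 1, which is the non-word fragment that a DRAIN command would then flush to memory.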

If the driver desires to terminate a transfer, two control bits in the DMA_CSR can be used. The
EN_DMA bit can be used to ignore new transfer requests from the DMA device when it is cleared.
Memory accesses by the memory interface are unaffected by this bit. The EN_DMA bit can be set or
cleared at any time without affecting the state of a transfer currently in progress. The FLUSH bit is
provided to clear the PACK_CNT if the driver wishes to clean up the state of a transfer, without draining
the packed data to memory. It is also used to clear the ERR_PEND indicator, allowing an error
condition, which subsequently halts the DMA interface state machine, to be cleared cleanly.

The DRAIN bit will cause all packed data to be sent to memory. This is intended for use when a transfer
completes and the data for transfer to memory does not fill the 32 bit word. It can also be used to
leave a transfer in a clean state if a transfer is stopped via the EN_DMA bit, which may be restarted
later. A DRAIN sequence will leave the address counter pointing to the byte address beyond the last
byte or word written. Hence the address counter must be reloaded before the next transfer to properly
set the BYTE_ADDR.



The DMA_CSR also contains a RESET bit which will generate an external reset signal and reset all DMA
interface logic (state machines). It is vital the RESET and/or FLUSH bits are not set if any memory
activity is still pending: a REQ_PEND bit is provided in the DMA_CSR to show the driver if the memory
interface is active. If REQ_PEND is asserted the driver should poll it until it is deasserted. Similarly
writing to the Address Counter, changing the WRITE bit in the DMA_CSR, or writing the Byte Counter
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6.0 Timing DiagrarTiS 



[Timing diagram: SBus DMA READ Cycle. Signals: sb_clk, sb_br_, sb_bg_, sb_d(31:0) (read),
sb_siz(2:0), sb_rd, sb_ack32_, sb_merr_ (parity error indicator)]

For further possible SBus cycles see the SBus specification

[Timing diagram: SBus DMA Write Cycle. Signals: sb_d(31:0) (write), sb_rd]
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S4-0MA 



[Timing diagram: Ethernet LANCE DMA read cycle (DATA available in PACK Reg). Signals: sb_clk,
e_hold_, e_hlda_, e_as_, e_das_, e_read, e_ad15:0, e_a23:16, e_rdy_. Annotated limits: 80ns min.,
75ns min., 120ns max., 250ns max.]

[1] e_hlda_ is only asserted when the interface is not busy

[2] e_hold_ will stay asserted for burst mode accesses; e_hlda_ must follow it
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sb dk 



[Timing diagram: DMA Read Cycle [fast cycle] (DATA available in UNPACK Register). Signals
include sb_clk and d_d7:0]

'READ' indicates transfer from memory to DMA device

When UNPACK reg is empty a memory read must occur which will subsequently lengthen
this operation (see SBus READ Cycle)

[Timing diagram: DMA Write Cycle [fast cycle]. Signals include sb_clk. Annotated limit: 50ns max.]

'Write' indicates transfer is from DMA device to memory
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S4-DMA 



[Timing diagram: Offboard ID read cycle [fast cycle]. Signals: sb_clk, sb_sel_ & sb_as_, sb_a,
id_cs_, d_rd_, d_d7:0, sb_ack32_]
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so CIK 



[Timing diagram: Extended DMA device register read cycle [slow cycle]. Signals include d_d7:0
and sb_ack32_]

[Timing diagram: Extended DMA device register write cycle [slow cycle]. Signals: sb_clk,
sb_sel_ & sb_as_, sb_a, d_d7:0, sb_ack(32/8)_]
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Switching Characteristics 



No.   SIGNAL   DESCRIPTION              CONDITIONS    min     max     units
1     CLK      clock period                           30              ns
2              clock high                                             ns
3              clock low                                              ns
4     Note1    hold wrt clk                           0               ns
5     Note1    setup to clk                           14.0            ns
6     Note1    setup to clk                           23.0            ns
7     Note1    hold wrt clk                           5.0             ns
8     Note1    setup to clk                           13.5            ns
9     Note1    hold wrt clk                           0               ns
10    Note1    clk to output valid      Load=100pf            30.4    ns
11    Note1    clk to output invalid    Load=100pf            22.0    ns
12    Note1    clk to output valid      Load=130pf            31.4    ns
13    Note1    clk to output invalid    Load=130pf            19.7    ns
14    Note1    clk to output low        Load=100pf            24      ns
15    Note1    clk to output high       Load=100pf            18.5    ns
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S4-DMA 



No.   SIGNAL        DESCRIPTION                   CONDITIONS   min    max     units
36    e_ad[15:0]    setup to clk (Note4)                       1.0            ns
37    e_ad[15:0]    hold wrt to clk (Note4)                    4.0            ns
38    e_ad[15:0]    clk to output valid           60pf                36.0    ns
39    e_ad[15:0]    clk to output invalid         80pf                25      ns
40    e_hlda_       clk to output high            80pf                18.0    ns
41    e_hlda_       clk to output low             80pf                21.5    ns
42    e_read        clk to output valid           80pf                15.5    ns
43    e_read        clk to output invalid         80pf                12      ns
44    e_das_        clk to output valid           80pf                23.0    ns
45    e_das_        clk to output invalid         80pf                18.5    ns
46    e_rdy_        clk to output valid           80pf                23.0    ns
47    e_rdy_        clk to output invalid         80pf                17.5    ns
48    e_cs_         clk to output high            80pf                15.5    ns
49    e_cs_         clk to output low             80pf                20.0    ns
50    e_rdy_        setup to clk                                      0       ns
51    e_rdy_        hold wrt to clk                                   2.8     ns
52    e_ad[15:0]    ADDR setup to e_as_ low                           15.0    ns
53    e_ad[15:0]    ADDR hold wrt e_as_ high                          0       ns
54    e_hold_       setup to clk                                      0       ns
55    e_hold_       hold wrt to clk                                   4.0     ns
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S4-DMA 



Timing Diagrams 



[Timing diagram: SBus Input Signals. Signals: sb_clk, sb_as_, sb_sel_, sb_bg_, sb_pa[X,Y,3,2,1],
sb_read, sb_dbus[31:0], sb_siz2, sb_siz1, sb_siz0, sb_ack8_, sb_ack32_, sb_err_, sb_merr_.
Timing parameters shown: 1, 3, 4, 8, 9]
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sb dk 



[Timing diagram: D_Bus Read/Write Cycle Timing (slow = high). Signals: sb_clk, d_cs_, id_cs_,
d_rd_, d_dbus[7:0], d_wr_, d_reset. Timing parameters shown: 22 through 27, 29, 30, 31, 33]
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sb dk 



[Timing diagram: LANCE READ and WRITE Cycle Timing. Signals: sb_clk, e_cs_, e_read/write]

Note1 Refer to LANCE timing specs



S4-DMA 7/26/88 



Sun Confidential 



Pace 



Note1: These values represent the timing characteristics of groups of signals. By referring to the
Timing Diagrams it can be seen that one mnemonic value can represent many different signal paths.

Note2: These timing parameters are true for both the signals d_cs_ and id_cs_.

Note3: The documented values represent the timing of an external device (in this case the ESP
SCSI chip), to which this gate-array is matched by design.

Note4: The setup and hold times refer to the timing diagram on which they are shown, and in
particular to the clock edges shown. The e_ad bus is designed to be that of the LANCE Ethernet
controller. Internal to this chip the e_ad bus is not latched for at least 2 clock cycles to alleviate any
potential timing problems. Hence the 0ns timing requirements shown are true only if the cycle by
cycle handshaking specified by the LANCE is maintained.



8.0 Revision History 



12/22/87   First Release.

2/11/88    Remove sb_address bus and multiplex addr/data on sb_d bus.
           Add SBus identification information.

3/28/88    Revised pinout. Corrected errors in register addressing. Added more info on
           programming. Updated timing diagrams. Revised block diagrams. Added
           register in MSbyte of ADDR_CNT. Revised operation of Terminal Count bit.

6/21/88    Added timing specs.

7/26/88    Included post_route timings.
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S4-Video 

Preliminary



Features

• Single-chip video subsystem
• Directly interfaces to Sbus Interface
• Supports 256K*4, 128K*8, and 64K*4 Video RAM
• Supports 1-bit, 8-bit, and 24-bit per pixel frame buffers
• Fully programmable video timing and resolution
• Supports up to 4 video clocks (software selectable)
• Supports Sun Video Monitor sense lines (for auto configuration)
• Directly interfaces to VRAM and RAMDAC (no external components required)
• Built-in Video Shifter for 1-bit frame buffers (maximum pixel clock [illegible] MHz)
• Low-cost 120PFP package



[Figure: S4-Video 8-bit Frame Buffer Application]

VRAM Interface (20 pins)

vma(8:0)      BT4       Video Multiplexed Address
vcas(3:0)_    BT4       Video Cas Enable (Byte select)
vras(3:0)_    BT4       Video Ras Enable (Bank select)
voe_          BT4       Video Output Enable
vwe_          BT4       Video Write Enable
vsc           BT4       Video Shift Clock

Misc Pins (8 pins)

mode(3:0)     TLCHTD    Memory mode and configuration
type          TLCHTD    VRAM type: 0: 256K, 1: 1Mbit
por_          TLCHNU    Power On Reset. Clears control register
para          B01TU     'Parametric Test' Output and Output Disable

Device Number:   LMA9141
Package Type:    PFP120

(PAD: 118, VDD: 6, VSS: 8, IO: 104)
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Address Space Decoding 

This section explains the S4-Video registers. On Reset, the master control
register is initialized to 0. All other registers are not initialized.

Master Decoding (A(23:22))

When the S4-Video chip is selected, address bits A(23:22) define the master mode as follows:

0x000000-0x3FFFFF    Sbus ID
0x400000-0x7FFFFF    Video Registers
0x800000-0xBFFFFF    Video RAM
0xC00000-0xFFFFFF    Reserved
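
The master decoding above amounts to a two-bit lookup on A(23:22). A minimal sketch follows, assuming the offset is the 24-bit address seen by the chip once it has been selected; the function and tuple names are illustrative and not part of the specification.

```python
# Regions selected by address bits A(23:22), in the order given in the
# master decoding table. The names here are illustrative only.
MASTER_REGIONS = ("Sbus ID", "Video Registers", "Video RAM", "Reserved")


def decode_master_region(offset: int) -> str:
    """Return the region name for a 24-bit offset into the chip's address space."""
    if not 0 <= offset <= 0xFFFFFF:
        raise ValueError("offset must fit in 24 bits")
    # Bits 23 and 22 pick one of the four 4-megabyte regions.
    return MASTER_REGIONS[(offset >> 22) & 0x3]


if __name__ == "__main__":
    for addr in (0x000000, 0x400012, 0x800000, 0xC00000):
        print(hex(addr), decode_master_region(addr))
```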



Sbus ID

The Sbus ID is either internal in the S4-Video chip or provided externally, as determined by the
status of X_CS_ at the end of POR_. If X_CS_ is grounded externally, the ID will be provided
internally and read as 0xFE01010y, where (y) is MODE(3:0). If X_CS_ is not grounded externally,
then the Sbus ID will be provided by an external PROM that is selected by X_CS_. The PROM can
have a size up to 4 MBytes.
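
As a small illustration of the internal ID, the sketch below forms the value 0xFE01010y from the four MODE pins. It assumes the ID is read as a single 32-bit word with MODE(3:0) in the low nibble; the function name is hypothetical.

```python
INTERNAL_ID_BASE = 0xFE010100  # upper 28 bits of the internally provided ID


def internal_sbus_id(mode: int) -> int:
    """Return the internally provided Sbus ID for a given MODE(3:0) value."""
    if not 0 <= mode <= 0xF:
        raise ValueError("MODE(3:0) is a 4-bit value")
    # The low nibble (y) mirrors the MODE pins.
    return INTERNAL_ID_BASE | mode


if __name__ == "__main__":
    print(hex(internal_sbus_id(0x5)))
```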

Video Registers 

The video registers start at the four megabyte (0x400000) boundary and extend up to the 8
megabyte boundary. There are a total of 16 registers, including the external DAC, which are
decoded with IOA(4:0).

Video RAM 

The frame buffer is decoded at the eight megabyte boundary (0x800000) up to the twelve
megabyte boundary (0xC00000). It is up to software to map in only memory that is physically
present on the frame buffer.

Reserved 

Accessing this area will return an ACK but cause no actions on the chip. This area can be used
to provide "dummy pages" for software.

Interrupt

When enabled, Interrupt is asserted at the beginning of vertical blank. The interrupt is cleared by
writing to the (read-only) status register.
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Video Register Description 



DAC Select

Accesses to these addresses are passed through to the external DAC for reading and writing of
the DAC registers.

Master Control Register
Bit     Function

7       Enables interrupts. When enabled, the S4-Video chip will generate an interrupt when
        the end of the frame is reached (start of VBLANK). The interrupt is cleared
        by reading the status register.

6       Video Enable. When set to 0, the blank output is constantly asserted independent of
        the internal counters. When set to 1, the VBLANK output follows what is
        programmed into the timing registers.

5       Timing Enable/Slave Mode. When set to 0, the internal video timers are disabled and
        the internal state-machine that controls the transfer cycles is triggered from the
        external inputs XREQ and XCLR. When set to 1, the video chip generates timing
        based on the values programmed and drives the XREQ and XCLR pins as outputs.

4       Cursor Enable Register. When set to 1, accesses to the frame buffer will cause a
        bus error if the address is within the range of the two address values programmed
        into the Cursor Start Address and the Cursor End Address Registers located at
        0x400012 and 0x400013.

2:3     Oscillator Select. Selects one of the three XI inputs as the source for the video
        timing. Selecting input 4 (2:3 = binary 11) causes the video logic to stop.

0:1     Divider. Selects a divide by 1, 2, 3 or 4 of the selected XI input.
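
The bit layout above can be sketched as a helper that assembles a Master Control Register value. Two points are assumptions rather than statements from this specification: that the divider field holds (divide ratio - 1), and that oscillator select values 0 to 2 name the three XI inputs in order. All names are illustrative.

```python
INT_ENABLE = 0x80     # bit 7: interrupt at start of VBLANK
VIDEO_ENABLE = 0x40   # bit 6: blank output follows the timing registers
TIMING_ENABLE = 0x20  # bit 5: 1 = generate timing (master), 0 = slave
CURSOR_ENABLE = 0x10  # bit 4: bus error inside the programmed cursor range
OSC_STOP = 0b11       # oscillator select value that stops the video logic


def master_control(int_enable, video_enable, timing_master, cursor_enable,
                   osc_select, divide_by):
    """Assemble an 8-bit Master Control Register value."""
    if not 0 <= osc_select <= 3:
        raise ValueError("oscillator select is a 2-bit field")
    if not 1 <= divide_by <= 4:
        raise ValueError("divider must be 1, 2, 3 or 4")
    # Assumed encoding: bits 3:2 = oscillator select, bits 1:0 = divide_by - 1.
    value = (osc_select << 2) | (divide_by - 1)
    if int_enable:
        value |= INT_ENABLE
    if video_enable:
        value |= VIDEO_ENABLE
    if timing_master:
        value |= TIMING_ENABLE
    if cursor_enable:
        value |= CURSOR_ENABLE
    return value


if __name__ == "__main__":
    # Master mode, video and interrupts on, first oscillator, divide by 1.
    print(hex(master_control(True, True, True, False, 0, 1)))
```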

Status Register
Bit     Function

7       Interrupt Pending. An interrupt was generated by the chip.

4:6     Monitor Sense. These three bits come directly from the three SNS inputs to the
        S4-Video chip. Useful for determining the type of monitor connected to the frame
        buffer.

0:3     Memory Mode. These four bits come directly from the MODE inputs to the S4-Video
        chip. Useful for determining what type of memory the frame buffer uses.
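
Reading the Status Register fields can be sketched as follows. The field positions come from the bit table above; the function name and dictionary keys are illustrative.

```python
def decode_status(value: int) -> dict:
    """Split an 8-bit Status Register value into its three fields."""
    if not 0 <= value <= 0xFF:
        raise ValueError("status register is 8 bits wide")
    return {
        "interrupt_pending": bool(value & 0x80),  # bit 7
        "monitor_sense": (value >> 4) & 0x7,      # bits 6:4, from the SNS inputs
        "memory_mode": value & 0xF,               # bits 3:0, from the MODE inputs
    }


if __name__ == "__main__":
    print(decode_status(0xB5))
```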
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VBSH 

The VBSH register contains the high order bits of the line number to start vertical blanking on.
The vertical counter is a 12 bit counter which requires two registers to program. The four least
significant bits of VBSH are used with VBSL to form the 12 bit line count. The four high order bits
of VBSH are don't cares, and are read as zero.

VBSL

The VBSL register contains the low order bits of the line number to start vertical blanking on. The
vertical counter is a 12 bit counter which requires two registers to program. The four least
significant bits of VBSH are used with VBSL to form the 12 bit line count. The VBS registers are
programmed in multiples of lines.

VBC

The VBC register contains the vertical blank end value. It is programmed in multiples of lines.
When the vertical counter reaches this value, the composite blank (DAC_BLK_) goes active. The
value for VBC must be programmed to be less than VBSH + VBSL.

VSS

The VSS register contains the vertical sync start value. It is programmed in multiples of lines.
When the vertical counter reaches VSS, the vertical sync output (VS_) goes active. VSS must
be programmed to be less than VBSH + VBSL, and should be less than VBC.

VSC

The VSC register contains the vertical sync end value. It is programmed in multiples of lines.
When the vertical counter reaches VSC, the vertical sync output (VS_) goes inactive. VSC must
be programmed to be less than VBSH + VBSL. It must also be programmed to be greater than VSS
and should be less than VBC. A basic vertical sweep with respect to the vertical counter should
look something like:

0 .... VSS .... VSC .... VBC .... VBSH + VBSL
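
The ordering rules for the vertical registers can be checked mechanically. The sketch below is only an illustration: it treats the "must" rules as errors and the "should" rules as warnings, takes VBS to be the combined 12-bit VBSH/VBSL value, and assumes every value fits the 12-bit vertical counter. The function name and message texts are hypothetical.

```python
MAX_LINE = 0xFFF  # the vertical counter is 12 bits wide


def check_vertical_timing(vss, vsc, vbc, vbs):
    """Return (errors, warnings) for one set of vertical register values."""
    errors, warnings = [], []
    for name, value in (("VSS", vss), ("VSC", vsc), ("VBC", vbc), ("VBS", vbs)):
        if not 0 <= value <= MAX_LINE:
            errors.append(f"{name} does not fit in 12 bits")
    # "Must" rules from the register descriptions.
    if not vbc < vbs:
        errors.append("VBC must be less than VBS")
    if not vss < vbs:
        errors.append("VSS must be less than VBS")
    if not vsc < vbs:
        errors.append("VSC must be less than VBS")
    if not vsc > vss:
        errors.append("VSC must be greater than VSS")
    # "Should" rules from the register descriptions.
    if not vss < vbc:
        warnings.append("VSS should be less than VBC")
    if not vsc < vbc:
        warnings.append("VSC should be less than VBC")
    return errors, warnings


if __name__ == "__main__":
    print(check_vertical_timing(vss=1, vsc=5, vbc=36, vbs=936))
```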

XCS

The XCS register contains the transfer hold off start value. It is programmed in multiples of 8
pixels. The S4-Video chip generates transfer cycles as necessary by counting shift clocks. The
shift clock is inactive during horizontal blanking, however. If an access to the frame buffer or any
of the internal registers were attempted during a horizontal blank that occurred during a
transfer cycle, the S4-Video chip would not be able to respond until after the blanking and
transfer were completed, which in computer time could be very long, degrading
performance. The XCS and XCC registers allow a window to be programmed relative
to the horizontal blank window (defined by HBS and HBC). This window prevents a transfer cycle
from starting until late in the horizontal blank period, thus allowing other accesses to the video
chip in the meantime.

The values for XCS and XCC will be less than HBS and HBC respectively and will depend greatly
on the relationship between the system clock and the video clock. The most important timing
[...] occurs before XCS (and HBS), then the request is processed. If the request occurs after XCS
but before HBS, then the S4-Video chip suspends the transfer cycle until XCC. This will allow the
IU to continue with other accesses during the [...]

Memory Controller Interface

The video controller interfaces to the memory controller via two signals: XREQ and XCLR.

XREQ (Transfer Cycle Request) forces a video RAM reload cycle using the address of the
transfer counter. Asserting XREQ causes the memory controller to begin a video reload cycle as
soon as current cycles are complete. The memory controller will wait until XREQ drops,
asynchronously deassert DT/OE and then complete the video reload cycle. This allows
on-the-fly video reload cycles.

XCLR (Transfer Clear) clears the reload counter and the transfer counter and forces a minimum
length reload cycle. XCLR is asserted in the state following (VBC & HBS).

When the S4-Video chip is in the master mode (control register bit 5 = 1) the XREQ and XCLR
signal pins are driven as outputs mirroring the internally generated transfer request and transfer
clear signals. These two signal pins can then be connected to a parallel S4-Video chip which is
configured in the slave mode (control register bit 5 = 0) to synchronize the two chips. When in
the slave mode, the XREQ and XCLR signal pins are treated as inputs to the internal
state-machines for synchronization purposes.
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Video RAM Interface 

The S4-Video controller generates its video memory outputs as follows:

VRAS(x) = RAS * (VIDEO + CPU * (BANK(1:0) = x))

VCAS(x) = CAS * (VIDEO + CPU * ((SIZ=1) * (BYTE=x)
                 + (SIZ=2) * ((BYTE=x)+(BYTE=x+1))
                 + (SIZ=3) * ((BYTE=x)+(BYTE=x+1)+(BYTE=x+2))
                 + (SIZ=4) * ((BYTE=x)+(BYTE=x+1)+(BYTE=x+2)+(BYTE=x+3))))

VMA(8:0) = MUX * (VIDEO * X(8:0) + CPU * ROW(8:0))
         + !MUX * (VIDEO * [illegible] + CPU * COL(8:0))



Cycle       CPU               Video/Refresh
            row      col      row      col

vma0        row0     col0     x0
vma1        row1     col1     x1
vma2        row2     col2     x2
vma3        row3     col3     x3
vma4        row4     col4     x4
vma5        row5     col5     x5
vma6        row6     col6     x6
vma7        row7     col7     x7
vma8        row8     col8     x8






Size           Address        Byte(0)   Byte(1)   Byte(2)   Byte(3)
sb_siz(1:0)    sb_pa(1:0)     CAS(0)    CAS(1)    CAS(2)    CAS(3)

0,0            0,0            X         X         X         X
               0,1                      X         X         X
               1,0                                X         X
               1,1                                          X

0,1            0,0            X
               0,1                      X
               1,0                                X
               1,1                                          X

1,0            0,0            X         X
               0,1                      X         X
               1,0                                X         X
               1,1                                          X

1,1            0,0            X         X         X
               0,1                      X         X         X
               1,0                                X         X
               1,1                                          X
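
The byte-select behaviour of the CAS outputs during a CPU cycle can be sketched as below. The mapping from sb_siz(1:0) codes to byte counts (0,0 = 4 bytes, 0,1 = 1, 1,0 = 2, 1,1 = 3) is an assumption read off the table of X marks above, and accesses that run past Byte(3) are simply clipped, as that table suggests. The names are illustrative.

```python
# Assumed byte count for each sb_siz(1:0) code, inferred from the CAS table.
SIZE_BYTES = {0b00: 4, 0b01: 1, 0b10: 2, 0b11: 3}


def cas_lines(siz: int, byte: int) -> list:
    """Return [CAS(0), CAS(1), CAS(2), CAS(3)] for a CPU access.

    siz is the sb_siz(1:0) code and byte is the starting byte, sb_pa(1:0).
    """
    if siz not in SIZE_BYTES:
        raise ValueError("sb_siz(1:0) is a 2-bit field")
    if not 0 <= byte <= 3:
        raise ValueError("sb_pa(1:0) is a 2-bit field")
    count = SIZE_BYTES[siz]
    # CAS(x) is asserted for each byte lane covered by the access.
    return [byte <= lane < byte + count for lane in range(4)]


if __name__ == "__main__":
    for siz in SIZE_BYTES:
        for byte in range(4):
            marks = "".join("X" if on else "." for on in cas_lines(siz, byte))
            print(f"siz={siz:02b} addr={byte:02b} {marks}")
```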
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Timing Diagrams 



The S4-Video controller supports four basic types of cycles: Refresh cycle, Video cycle, CPU
cycle, and Burst cycle. For both the refresh cycle and the CPU cycle, the S4 RAM controller
supports two VRAM speeds via the speed input: fast and slow. The fast mode supports a
minimum cycle of 4 states for CPU cycles and 5 states for Refresh. In slow mode, RAS is
extended by one additional state. This allows the use of slower RAMs at the cost of an increased
cycle time. There is no separate fast and slow mode for burst cycles.

Cycle Overview. The S4-RAM controller stays in the idle state (S0) until activated by a
refresh request, causing a refresh cycle; a video request, causing a video cycle; or a CPU
select, causing a CPU cycle. In case of simultaneous video, refresh, and CPU requests, the video
request has the highest priority and the refresh request the second.
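
The arbitration order out of the idle state can be written as a fixed-priority choice. This is a behavioural sketch only, not the chip's state machine; the function name and return strings are illustrative.

```python
def next_cycle(video_request: bool, refresh_request: bool, cpu_select: bool) -> str:
    """Pick the cycle started from idle state S0 when requests coincide."""
    if video_request:      # highest priority
        return "video"
    if refresh_request:    # second priority
        return "refresh"
    if cpu_select:         # served only when nothing else is pending
        return "cpu"
    return "idle"


if __name__ == "__main__":
    print(next_cycle(video_request=False, refresh_request=True, cpu_select=True))
```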

CPU cycles are initiated when a select RAM signal is received in conjunction with a matching set
of addresses (see address decoding table). In response to the CPU request, the RAM controller
activates RAS for the bank of memory decoded by the addresses.

Refresh cycles are generated internally by a refresh request which occurs every 320 system
clocks. For a 20 MHz system clock, this is one refresh cycle every 16 usec.
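
The 16 usec figure follows directly from the clock count: 320 clocks at 20 MHz. A short sketch of the arithmetic, with the clock frequency as a parameter so other system clocks can be tried (the function name is illustrative):

```python
REFRESH_CLOCKS = 320  # system clocks between refresh requests


def refresh_interval_usec(system_clock_mhz: float) -> float:
    """Return the time between refresh requests in microseconds."""
    if system_clock_mhz <= 0:
        raise ValueError("clock frequency must be positive")
    # clocks / (clocks per microsecond) = microseconds
    return REFRESH_CLOCKS / system_clock_mhz


if __name__ == "__main__":
    print(refresh_interval_usec(20.0))
```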

Video cycles are initiated by a transition on input XREQ, which is generated by the video
controller whenever a video transfer is necessary.
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CPU Cycle 

In response to a select, the RAM controller enters state S1 and asserts RAS for the bank of
memory decoded by the addresses. The row/column addresses are multiplexed on the
half-state following RAS. In state S2, the RAM controller asserts CAS and the acknowledge signal
VACK. Following S2, in fast mode the RAM controller finishes up with S10, which deasserts RAS
and VACK while keeping CAS asserted. In slow mode, the RAM controller extends RAS in state
S9, and then deasserts all control signals in state S10. In both cases, write data (WDATA) must
be valid at the beginning of CAS and read data (RDATA) is valid at the end of CAS.



[Timing diagram: CPU cycle. Signals shown: CLK, ISEL_, VRAS_, VMA, VCAS_, VACK_, RDATA, WDATA, VWE_ (write cycles only), VOE_ (read cycles only). States shown: S0, S1, S2, S3, S4, S0. Timing parameters marked: T4, T5, T6.]
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Refresh Cycle 

Refresh is implemented with a "CAS-before-RAS" cycle. Once a refresh request is recognized,
all CAS outputs are asserted during state R0, followed by all RAS outputs asserted at state R1.
REF, RAS, and CAS stay asserted during R2 and are deasserted in state S10. In "slow" mode, a
state S9 is inserted that extends all control signals for one extra state. A refresh request takes
priority over CPU cycles that arrive at the same time. Pending CPU cycles have to wait until they
are recognized in S0.



[Timing diagram: Refresh cycle. Signals shown: CLK, RREQ_ *, REF_ *, VCAS_, VRAS_. States shown: S0, R1, R2, R3, S3, S4.]

* RREQ_ and REF_ are internally generated signals.





S4-Video 10/6/88          Sun Confidential          Page 19







Table of Monitor Timings

Type       5           4           3           2           1
           Sun         Sun         Sun         Sun         Apple       Apple
           1600x1280   1280x1024   1152x900    1024x768    1152x870    640x480     Unit

HRes       1600        1280        1152        1024        1152        640         Pixel
VRes       1280        1024        900         768         870         480         Pixel
PClock     200.00      135.00      92.9405     70.400      100.00      30.000      MHz
HClock     89.00                   61.80       53.66       68.700      35.000      kHz
HTime                              16.182      18.64       14.56       28.5714     usec
VClock     67.00                   65.96       66          75          66.666      Hz
VTime                              15.163      15.15       13.32       16.000      msec

Register   Value       Value       Value       Value       Value       Value       Unit Conversion

HBS                                1504        1312        1456        (856)       (X/8)-1
HBC                                352         288         304         (216)       (X/8)-1
HSS                                16          1312        32          (32)        (X/8)-1
HSC                                144         160         160         (48)        (X/8)-1
VBS                                937         813         915         525         (X)-1
VBC                                37          45          45          45          (X)-1
VSS                                2           (2)         3           (2)         (X)-1
VSC                                6           6           6           (8)         (X)-1

Note1: HSC0 = HSC, HSC1 = HBS - HSC
Note2: VBSH = (X DIV 256), VBSL = (X MOD 256)-1
Note3: Values in parentheses are estimates at this time.
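
The conversion column can be applied mechanically. The sketch below converts a horizontal value in pixels with (X/8)-1 and a vertical value in lines with (X)-1, then splits the vertical blank start into VBSH and VBSL. Splitting (X)-1 into a high and a low byte is an assumption; it agrees with Note2 whenever X is not a multiple of 256. The function names are illustrative.

```python
def h_register(pixels: int) -> int:
    """Convert a horizontal timing value in pixels to its register value."""
    if pixels <= 0 or pixels % 8:
        raise ValueError("horizontal values are programmed in multiples of 8 pixels")
    return pixels // 8 - 1


def v_register(lines: int) -> int:
    """Convert a vertical timing value in lines to its register value."""
    if lines <= 0:
        raise ValueError("line count must be positive")
    return lines - 1


def split_vbs(lines: int) -> tuple:
    """Return (VBSH, VBSL) for a vertical blank start given in lines."""
    value = v_register(lines)
    if value > 0xFFF:
        raise ValueError("the vertical counter is only 12 bits wide")
    # VBSH holds the upper 4 bits, VBSL the lower 8 bits.
    return (value >> 8) & 0xF, value & 0xFF


if __name__ == "__main__":
    # HBS = 1504 pixels and VBS = 937 lines, taken from the table above.
    print(h_register(1504), split_vbs(937))
```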
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Rivision History 

Date       Change                                                                By

7/18/88    First Release.                                                        AVB

7/20/88    Defined XCS and XCC registers for use in delaying transfer cycles.    MWI

7/22/88    Added cursor registers, corrected timing diagram labels,              MWI
           corrected pin counts, removed diagnostic register, fixed address
           mappings for 64K x 4 VRAMs.

10/4/88    Added XREQ and XCLR, removed xi(3) and FAST, removed all              MWI
           timing diagrams related to FAST mode, added words about
           synchronous operation of two chips, added words about cursor start
           and end address registers.

10/6/88    Removed confusing wording about cursor address registers in           MWI
           description of control register.